(54) DNA CHAIN USEFUL FOR XANTHOPHYLL SYNTHESIS AND PROCESS FOR PRODUCING
XANTHOPHYLLS

(57) The following DNA chains relate to xanthophylls having a keto group, represented by
astaxanthin, and the following technique relates to a genetically engineered production of
xanthophylls: a DNA chain having a base sequence coding for a polypeptide having an
enzymatic activity of converting the 4-methylene group of a β-ionone ring into a keto group;
a DNA chain having a base sequence coding for a polypeptide having an enzymatic activity of
converting the 4-methylene group of a 3-hydroxy-β-ionone ring into a keto group; a DNA chain
having a base sequence coding for a polypeptide having an enzymatic activity of adding a
hydroxyl group to the 3-carbon atom of a 4-keto-β-ionone ring; and a process for producing
various xanthophylls, such as canthaxanthin and astaxanthin, by introducing the above DNA
chain(s) into a suitable microorganism, e.g., Escherichia coli, followed by expression thereof.
Description
Technical Field 

The present invention relates to DNA strands useful for the synthesis of keto group-containing xanthophylls
(ketocarotenoids) such as astaxanthin, which are useful for heightening the color of cultured fishes and shellfishes
such as sea breams, salmons, lobsters and the like and are used for foods as a coloring agent and an antioxidant, and
to a process for producing keto group-containing xanthophylls (ketocarotenoids) such as astaxanthin with use of a
microorganism into which the DNA strands have been introduced.

Background Art



The term xanthophylls means carotenoid pigments having an oxygen-containing group such as a hydroxyl group, a
keto group or an epoxy group. Carotenoids are synthesized by the isoprenoid biosynthetic pathway, which is shared
partway with steroids and other terpenoids, with mevalonic acid as a starting material. C15 farnesyl pyrophosphate
(FPP) resulting from the basic isoprenoid biosynthetic pathway is condensed with C5 isopentenyl pyrophosphate (IPP)
to give C20 geranylgeranyl pyrophosphate (GGPP). Two molecules of GGPP are condensed to synthesize colorless
phytoene as the initial carotenoid. The phytoene is converted into phytofluene, ζ-carotene, neurosporene and then
lycopene by a series of desaturation reactions, and lycopene is in turn converted into β-carotene by the cyclization
reaction. It is believed that a variety of xanthophylls are synthesized by introducing a hydroxyl group or a keto group
into the β-carotene (see Britton, G., "Biosynthesis of Carotenoids"; Plant Pigments, Goodwin, T.W. ed., London,
Academic Press, 1988, pp. 133-182).

The present inventors have recently made it possible to clone a carotenoid biosynthesis gene cluster from an
epiphytic non-photosynthetic bacterium Erwinia uredovora in Escherichia coli with an index of the yellow tone of the
bacterium, a variety of combinations of the genes being expressed in microorganisms such as Escherichia coli to produce
phytoene, lycopene, β-carotene, and zeaxanthin, which is a derivative of β-carotene into which hydroxyl groups have
been introduced (see Fig. 10; Misawa, N., Nakagawa, M., Kobayashi, K., Yamano, S., Izawa, Y., Nakamura, K.,
Harashima, K.; "Elucidation of the Erwinia uredovora Carotenoid Biosynthetic Pathway by Functional Analysis of Gene
Products Expressed in Escherichia coli", J. Bacteriol., 172, p. 6704-6712, 1990; Misawa, N., Yamano, S., Ikenaga, H.,
"Production of β-carotene in Zymomonas mobilis and Agrobacterium tumefaciens by Introduction of the Biosynthesis
Genes from Erwinia uredovora", Appl. Environ. Microbiol., 57, p. 1847-1849, 1991; and Japanese Patent Application
No. 58786/1991 (Japanese Patent Application No. 53255/1990): "DNA Strands useful for the Synthesis of
Carotenoids").

On the other hand, astaxanthin, a red xanthophyll, is a typical animal carotenoid which occurs particularly in a wide
variety of marine animals including red fishes such as a sea bream and a salmon, and crustaceans such as a crab and
a lobster. In general, animals cannot biosynthesize carotenoids, so that it is necessary for them to ingest carotenoids
synthesized by microorganisms or plants from their environments. Thus, astaxanthin has hitherto been used widely for
strengthening the color of cultured fishes and shellfishes such as a sea bream, a salmon, a lobster and the like.
Moreover, astaxanthin has attracted attention not only as a coloring matter in foods but also as an antioxidant for
removing active oxygen generated in bodies, which causes carcinoma (see Takao Matsuno ed., "Physiological Functions
and Bioactivities of Carotenoids in Animals", Kagaku to Seibutsu, 28, p. 219-227, 1990). As the sources of astaxanthin,
there have been known crustaceans such as krill in the Antarctic Ocean, cultured products of a yeast Phaffia, cultured
products of a green alga Haematococcus, and products obtained by organic synthetic methods. However, when
crustaceans such as krill in the Antarctic Ocean or the like are used, laborious work and much expense are required
for the isolation of astaxanthin from contaminants such as lipids and the like during the harvesting and extraction of
the krill. Moreover, in the case of the cultured product of the yeast Phaffia, a great deal of expense is required for the
gathering and extraction of astaxanthin, since the yeast has rigid cell walls and produces astaxanthin only in a low yield.
Also, in the case of the cultured product of the green alga Haematococcus, not only is a location for collecting sunlight
or an investment in a culturing apparatus for supplying artificial light required in order to supply the light which is
essential to the synthesis of astaxanthin, but it is also difficult to separate astaxanthin from fatty acid esters as
by-products or chlorophylls present in the cultured products. For these reasons, astaxanthin produced from biological
sources is at present inferior to that obtained by the organic synthetic methods on the basis of cost. The organic
synthetic methods, however, have a problem of by-products produced during the reactions in consideration of its use
as a feed for fishes and shellfishes and an additive to foods, and the products obtained by the organic synthetic methods
run counter to the consumer's preference for natural products. Thus, it has been desired to supply an inexpensive
astaxanthin which is safe and produced from biological sources and thus has a good image to consumers, and to develop
a process for producing the astaxanthin.
Disclosure of the Invention



It would be considered very useful to find a group of genes playing a role in the biosynthesis of astaxanthin,
because it is possible to afford astaxanthin-producing ability to a microorganism optimum in safety as a food or in
potentiality for producing astaxanthin, regardless of the presence of astaxanthin-producing ability, by introducing a
gene cluster for astaxanthin biosynthesis into the microorganism. No problem of by-products as contaminants is caused
in this case, so that it would be considered not so difficult to increase the production amount of astaxanthin with
recent advanced techniques of gene manipulation to a level higher than that accomplished by the organic synthetic
methods. However, while the groups of genes for synthesizing zeaxanthin, one of the xanthophylls, have already been
acquired by the present inventors as described above, no genes encoding a keto group-introducing enzyme required for
the synthesis of astaxanthin have been successfully obtained. The reasons for the failure in obtaining the genes include
that the keto group-introducing enzyme is a membrane protein and loses its activity when isolated from the membrane,
so that it was impossible to purify the enzyme or measure its activity and no information on the enzyme has been
obtained. Thus, it has hitherto been impossible to produce astaxanthin in microorganisms by gene manipulation.

The object of the present invention is to provide DNA strands which contain genes required for producing keto
group-containing xanthophylls (ketocarotenoids) such as astaxanthin in microorganisms by obtaining such genes coding
for enzymes such as a keto group-introducing enzyme required for producing keto group-containing xanthophylls
(ketocarotenoids) such as astaxanthin, and to provide a process for producing keto group-containing xanthophylls
(ketocarotenoids) such as astaxanthin with the microorganisms into which the DNA strands have been introduced.

The gene cloning method which is usually employed, comprising purifying the aimed protein, partially determining
the amino acid sequence and obtaining genes by a synthetic probe, cannot be employed because the purification of
the astaxanthin synthetic enzyme is impossible as described above. Thus, the present inventors have paid attention
to the fact that the cluster of carotenoid synthesis genes in the non-photosynthetic bacterium (Erwinia) functions in
Escherichia coli, in which lycopene and β-carotene, which are believed to be intermediates for biosynthesis of
astaxanthin, can be produced with combinations of the genes from the gene cluster, and have used Escherichia coli as a
host for cloning of astaxanthin synthetic genes. The present inventors have also paid attention to the facts that some
marine bacteria have an astaxanthin-producing ability (Yokoyama, A., Izumida, H., Miki, W., "Marine bacteria produced
astaxanthin", 10th International Symposium on Carotenoids, Abstract, CL11-3, 1993), that a series of related genes
would constitute a cluster in the case of bacteria, and that the gene cluster would be expressed functionally in
Escherichia coli in the case of bacteria. The present inventors have thus selected the marine bacteria as the gene
sources. They have carried out research with a combination of these two means and successfully obtained the gene
group which is required for the biosynthesis of astaxanthin and the other keto group-containing xanthophylls from
marine bacteria. They have thus accomplished the present invention. In addition, it has been first elucidated in the
present invention that the astaxanthin synthesis genes in marine bacteria constitute a cluster and express their
function in Escherichia coli, and that these gene products can utilize β-carotene or lycopene as a substrate.
The DNA strands according to the present invention are set forth as follows. 

(1) A DNA strand having a nucleotide sequence which encodes a polypeptide having an enzyme activity for
converting the methylene group at the 4-position of the β-ionone ring into a keto group.

(2) A DNA strand having a nucleotide sequence which encodes a polypeptide having an enzyme activity for
converting the methylene group at the 4-position of the β-ionone ring into a keto group and having an amino acid
sequence substantially of amino acid Nos. 1 - 212 which is shown in the SEQ ID NO: 1.

(3) A DNA strand hybridizing with the DNA strand described in (2) and having a nucleotide sequence which encodes a
polypeptide having an enzyme activity described in (2).

(4) A DNA strand having a nucleotide sequence which encodes a polypeptide having an enzyme activity for
converting the methylene group at the 4-position of the β-ionone ring into a keto group and having an amino acid
sequence substantially of amino acid Nos. 1 - 242 which is shown in the SEQ ID NO: 5.

(5) A DNA strand hybridizing with the DNA strand described in (4) and having a nucleotide sequence which encodes a
polypeptide having an enzyme activity described in (4).

(6) A DNA strand having a nucleotide sequence which encodes a polypeptide having an enzyme activity for
converting β-carotene into canthaxanthin via echinenone and having an amino acid sequence substantially of amino
acid Nos. 1 - 212 which is shown in the SEQ ID NO: 1.

(7) A DNA strand hybridizing with the DNA strand described in (6) and having a nucleotide sequence which encodes a
polypeptide having an enzyme activity described in (6).

(8) A DNA strand having a nucleotide sequence which encodes a polypeptide having an enzyme activity for
converting β-carotene into canthaxanthin via echinenone and having an amino acid sequence substantially of amino
acid Nos. 1 - 242 which is shown in the SEQ ID NO: 5.

(9) A DNA strand hybridizing with the DNA strand described in (8) and having a nucleotide sequence which encodes a
polypeptide having an enzyme activity described in (8).



(10) A DNA strand having a nucleotide sequence which encodes a polypeptide having an enzyme activity for
converting the methylene group at the 4-position of the 3-hydroxy-β-ionone ring into a keto group.

(11) A DNA strand having a nucleotide sequence which encodes a polypeptide having an enzyme activity for
converting the methylene group at the 4-position of the 3-hydroxy-β-ionone ring into a keto group and having an amino
acid sequence substantially of amino acid Nos. 1 - 212 which is shown in the SEQ ID NO: 1.

(12) A DNA strand hybridizing with the DNA strand described in (11) and having a nucleotide sequence which encodes
a polypeptide having an enzyme activity described in (11).

(13) A DNA strand having a nucleotide sequence which encodes a polypeptide having an enzyme activity for
converting the methylene group at the 4-position of the 3-hydroxy-β-ionone ring into a keto group and having an amino
acid sequence substantially of amino acid Nos. 1 - 242 which is shown in the SEQ ID NO: 5.

(14) A DNA strand hybridizing with the DNA strand described in (13) and having a nucleotide sequence which encodes
a polypeptide having an enzyme activity described in (13).

(15) A DNA strand having a nucleotide sequence which encodes a polypeptide having an enzyme activity for
converting zeaxanthin into astaxanthin by way of 4-ketozeaxanthin and having an amino acid sequence substantially
of amino acid Nos. 1 - 212 which is shown in the SEQ ID NO: 1.

(16) A DNA strand hybridizing with the DNA strand described in (15) and having a nucleotide sequence which encodes
a polypeptide having an enzyme activity described in (15).

(17) A DNA strand having a nucleotide sequence which encodes a polypeptide having an enzyme activity for
converting zeaxanthin into astaxanthin by way of 4-ketozeaxanthin and having an amino acid sequence substantially
of amino acid Nos. 1 - 242 which is shown in the SEQ ID NO: 5.

(18) A DNA strand hybridizing with the DNA strand described in (17) and having a nucleotide sequence which encodes
a polypeptide having an enzyme activity described in (17).

(19) A DNA strand having a nucleotide sequence which encodes a polypeptide having an enzyme activity for
adding a hydroxyl group to the 3-carbon of the 4-keto-β-ionone ring.

(20) A DNA strand having a nucleotide sequence which encodes a polypeptide having an enzyme activity for
adding a hydroxyl group to the 3-carbon of the 4-keto-β-ionone ring and having an amino acid sequence
substantially of amino acid Nos. 1 - 162 which is shown in the SEQ ID NO: 2.

(21) A DNA strand hybridizing with the DNA strand described in (20) and having a nucleotide sequence which encodes
a polypeptide having an enzyme activity described in (20).

(22) A DNA strand having a nucleotide sequence which encodes a polypeptide having an enzyme activity for
adding a hydroxyl group to the 3-carbon of the 4-keto-β-ionone ring and having an amino acid sequence
substantially of amino acid Nos. 1 - 162 which is shown in the SEQ ID NO: 6.

(23) A DNA strand hybridizing with the DNA strand described in (22) and having a nucleotide sequence which encodes
a polypeptide having an enzyme activity described in (22).

(24) A DNA strand having a nucleotide sequence which encodes a polypeptide having an enzyme activity for
converting canthaxanthin into astaxanthin by way of phoenicoxanthin and having an amino acid sequence substantially
of amino acid Nos. 1 - 162 which is shown in the SEQ ID NO: 2.

(25) A DNA strand hybridizing with the DNA strand described in (24) and having a nucleotide sequence which encodes
a polypeptide having an enzyme activity described in (24).

(26) A DNA strand having a nucleotide sequence which encodes a polypeptide having an enzyme activity for
converting canthaxanthin into astaxanthin by way of phoenicoxanthin and having an amino acid sequence substantially
of amino acid Nos. 1 - 162 which is shown in the SEQ ID NO: 6.

(27) A DNA strand hybridizing with the DNA strand described in (26) and having a nucleotide sequence which encodes
a polypeptide having an enzyme activity described in (26).

The present invention also relates to a process for producing xanthophylls. 

That is, the process for producing xanthophylls according to the present invention is set forth below. 

(1) A process for producing a xanthophyll comprising introducing the DNA strand described in any one of the above
mentioned DNA strands (1) - (9) into a microorganism having a β-carotene-synthesizing ability, culturing the
transformed microorganism in a culture medium, and obtaining canthaxanthin or echinenone from the cultured cells.

(2) A process for producing a xanthophyll comprising introducing the DNA strand described in any one of the above
mentioned DNA strands (10) - (18) into a microorganism having a zeaxanthin-synthesizing ability, culturing the
transformed microorganism in a culture medium, and obtaining astaxanthin or 4-ketozeaxanthin from the cultured
cells.

(3) A process for producing a xanthophyll comprising introducing the DNA strand described in any one of the above
mentioned DNA strands (19) - (27) into a microorganism having a canthaxanthin-synthesizing ability, culturing the
transformed microorganism in a culture medium, and obtaining astaxanthin or phoenicoxanthin from the cultured
cells.

(4) A process for producing a xanthophyll according to any one of the above mentioned processes (1) - (3), wherein
the microorganism is a bacterium or yeast.

Brief Description of the Drawings 


Fig. 1 illustrates diagrammatically the nucleotide sequence of the keto group-introducing enzyme gene (crtW
gene) of the marine bacterium Agrobacterium aurantiacus sp. nov. MK1 and the amino acid sequence of a polypeptide
to be encoded thereby.

Fig. 2 illustrates diagrammatically the nucleotide sequence of the hydroxyl group-introducing enzyme gene (crtZ
gene) of the marine bacterium Agrobacterium aurantiacus sp. nov. MK1 and the amino acid sequence of a polypeptide
to be encoded thereby.

Fig. 3 illustrates diagrammatically the nucleotide sequence of the lycopene-cyclizing enzyme gene (crtY gene) of
the marine bacterium Agrobacterium aurantiacus sp. nov. MK1 and the amino acid sequence of a polypeptide to be
encoded thereby.

Fig. 4 illustrates diagrammatically the continuation of the sequences following those illustrated in Fig. 3.

Fig. 5 illustrates diagrammatically the nucleotide sequence of the xanthophyll synthesis gene cluster of the marine
bacterium Agrobacterium aurantiacus sp. nov. MK1.

The letters A - F in Fig. 5 correspond to those in Figs. 1 - 4.

Fig. 6 illustrates diagrammatically the continuation of the sequence following that illustrated in Fig. 5.
Fig. 7 illustrates diagrammatically the continuation of the sequence following that illustrated in Fig. 6.
Fig. 8 illustrates diagrammatically the continuation of the sequence following that illustrated in Fig. 7.
Fig. 9 illustrates diagrammatically the continuation of the sequence following that illustrated in Fig. 8.
Fig. 10 illustrates diagrammatically the carotenoid biosynthetic route of the non-photosynthetic bacterium Erwinia
uredovora and the functions of the carotenoid synthetic genes.
Fig. 11 illustrates diagrammatically the main xanthophyll biosynthetic routes of the marine bacteria Agrobacterium
aurantiacus sp. nov. MK1 and Alcaligenes sp. PC-1 and the functions of the xanthophyll synthesis genes.
The function of the crtY gene, however, has been confirmed only in the former bacterium.

Fig. 12 illustrates diagrammatically a variety of deletion plasmids containing the xanthophyll synthesis genes
(cluster) of the marine bacterium Agrobacterium aurantiacus sp. nov. MK1.
The letter P represents the promoter of the lac of the vector pBluescript II SK. The positions of cutting with
restriction enzymes are represented by abbreviations as follows: Sa, SacI; X, XbaI; B, BamHI; P, PstI; E, EcoRI;
S, SalI; A, ApaI; K, KpnI; St, StuI; N, NruI; Bg, BglII; Nc, NcoI; Hc, HincII.

Fig. 13 illustrates diagrammatically the nucleotide sequence of the keto group-introducing enzyme gene (crtW
gene) of the marine bacterium Alcaligenes sp. PC-1 and the amino acid sequence of a polypeptide to be encoded
thereby.

Fig. 14 illustrates diagrammatically the continuation of the sequences following those illustrated in Fig. 13.

Fig. 15 illustrates diagrammatically the nucleotide sequence of the hydroxyl group-introducing enzyme gene (crtZ
gene) of the marine bacterium Alcaligenes sp. PC-1 and the amino acid sequence of a polypeptide to be encoded
thereby.

Fig. 16 illustrates diagrammatically the nucleotide sequence of the xanthophyll synthetic gene cluster of the marine
bacterium Alcaligenes sp. PC-1 and the amino acid sequence of a polypeptide to be encoded thereby. The letters A - D
in Fig. 16 correspond to those in Figs. 13 - 15.

Fig. 17 illustrates diagrammatically the continuation of the sequences following those illustrated in Fig. 16.
Fig. 18 illustrates diagrammatically the continuation of the sequences following those illustrated in Fig. 17.
Fig. 19 illustrates diagrammatically a variety of deletion plasmids containing the xanthophyll synthetic genes
(cluster) of the marine bacterium Alcaligenes sp. PC-1.

The letter P represents the promoter of the lac of the vector pBluescript II SK+.

Fig. 20 illustrates diagrammatically xanthophyll biosynthetic routes containing minor biosynthetic routes in the
marine bacteria Agrobacterium aurantiacus sp. nov. MK1 and Alcaligenes sp. PC-1 and the functions of the xanthophyll
synthesis genes.

Minor biosynthetic routes are represented by dotted arrows.

Best Mode for carrying out the Invention 

The present invention is intended to provide DNA strands which are useful for synthesizing keto group-containing
xanthophylls (ketocarotenoids) such as astaxanthin and are derived from the marine bacteria Agrobacterium aurantiacus
sp. nov. MK1 and Alcaligenes sp. PC-1, and a process for producing keto group-containing xanthophylls
(ketocarotenoids), i.e. astaxanthin, phoenicoxanthin, 4-ketozeaxanthin, canthaxanthin, and echinenone, with use of a
microorganism into which the DNA strands have been introduced.

The DNA strands according to the present invention are in principle illustrated generally by the aforementioned
DNA strands (1), (10) and (19) from the standpoint of the fine chemical-generating reaction, and basically defined by
the aforementioned DNA strands (2), (4), (11), (13), (20) and (22). The specific examples of the DNA strands (2) and
(4) are the aforementioned DNA strands (6) and (8); the specific examples of the DNA strands (11) and (13) are the
aforementioned DNA strands (15) and (17); and the specific examples of the DNA strands (20) and (22) are the
aforementioned DNA strands (24) and (26). In this connection, the DNA strands (3), (5), (7), (9), (12), (14), (16), (18),
(21), (23), (25) and (27) hybridize with the DNA strands (2), (4), (6), (8), (11), (13), (15), (17), (20), (22), (24) and
(26), respectively, under a stringent condition.
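
The correspondence stated above between the hybridizing DNA strands and their reference strands can be tabulated as a small sketch. The names `HYBRIDIZING`, `REFERENCE`, `PARTNER` and `gene_for_strand` are hypothetical bookkeeping aids and are not part of the disclosure; the assignment of strands (1) - (18) to crtW and (19) - (27) to crtZ follows the gene sections of this description.

```python
# Illustrative bookkeeping only; all names here are hypothetical.

# Hybridizing DNA strands and the reference strands they hybridize with,
# listed in the order given in the text ("respectively").
HYBRIDIZING = [3, 5, 7, 9, 12, 14, 16, 18, 21, 23, 25, 27]
REFERENCE = [2, 4, 6, 8, 11, 13, 15, 17, 20, 22, 24, 26]

# Map each hybridizing strand number to its reference strand number.
PARTNER = dict(zip(HYBRIDIZING, REFERENCE))


def gene_for_strand(number: int) -> str:
    """Return the gene class of a numbered DNA strand.

    Strands (1)-(18) are crtW genes and strands (19)-(27) are crtZ genes.
    """
    if 1 <= number <= 18:
        return "crtW"
    if 19 <= number <= 27:
        return "crtZ"
    raise ValueError(f"no DNA strand numbered ({number})")


if __name__ == "__main__":
    for hyb, ref in PARTNER.items():
        print(f"({hyb}) hybridizes with ({ref}); both are {gene_for_strand(hyb)} strands")
```

In this numbering each hybridizing strand immediately follows the strand it refers to, so a pair never spans the crtW and crtZ groups.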

The polypeptides encoded by the DNA strands according to the present invention have amino acid sequences
substantially in a specific range as described above in SEQ ID NOS: 1 - 2, and 5 - 6 (Figs. 1 - 2, and 13 - 15), e.g. an
amino acid sequence of amino acid Nos. 1 - 212 in SEQ ID NO: 1 (A - B in Fig. 1). In the present invention, the four
polypeptides encoded by these DNA strands (that is, four enzymes participating in the xanthophyll-producing reaction)
may be modified by deletion, substitution or addition in some of the amino acids provided that the polypeptides have
the enzyme activities as described above (see Example 13). This is what is meant by "amino acid sequences
substantially ...". For instance, an enzyme of which the amino acid at the first position (Met) has been deleted is also
included in the polypeptide or enzyme obtained by the modification of the amino acid sequence. In this connection, it
is needless to say that the DNA strands according to the present invention for encoding the polypeptides also include,
in addition to those having nucleotide sequences in a specific range shown in SEQ ID NOS: 1 - 2, and 5 - 6 (Figs. 1 - 2,
and 13 - 15), degenerate isomers encoding the same polypeptides as above and differing only in degenerate codons.
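
The notion of degenerate isomers can be illustrated with a minimal sketch, assuming the standard genetic code. The short sequences are hypothetical toy examples and are not taken from the SEQ ID listings; the names `translate` and `are_degenerate_isomers` are likewise illustrative only.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (TTT, TTC, TTA, TTG, TCT, ...); "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding DNA sequence, codon by codon, into one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def are_degenerate_isomers(dna_a: str, dna_b: str) -> bool:
    """True if two different nucleotide sequences encode the same polypeptide."""
    return dna_a.upper() != dna_b.upper() and translate(dna_a) == translate(dna_b)


if __name__ == "__main__":
    # Hypothetical toy sequences differing only in synonymous codons.
    original = "ATGGCTAAATAA"
    isomer = "ATGGCCAAGTGA"
    print(translate(original), translate(isomer))
    print(are_degenerate_isomers(original, isomer))
```

Two strands that pass this check differ at the nucleotide level but would be expected to give the same polypeptide on expression.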



Keto group-introducing enzyme gene (crtW)

The DNA strands (1) - (18) are genes which encode the keto group-introducing enzymes (referred to hereinafter as
crtW). Typical examples of the genes are crtW genes cloned from the marine bacteria Agrobacterium aurantiacus sp.
nov. MK1 or Alcaligenes sp. PC-1, which are the DNA strands comprising the nucleotide sequences encoding the
polypeptides having the amino acid sequences A - B in Fig. 1 (amino acid Nos. 1 - 212 in SEQ ID NO: 1) or A - B in
Figs. 13 - 14 (amino acid Nos. 1 - 242 in SEQ ID NO: 5). The crtW gene product (also referred to hereinafter as CrtW)
has an enzyme activity for converting the 4-methylene group of the β-ionone ring into a keto group, and one of the
specific examples is an enzyme activity for synthesizing canthaxanthin with β-carotene as a substrate by way of
echinenone (see Fig. 11). In addition, the crtW gene product also has an enzyme activity for converting the 4-methylene
group of the 3-hydroxy-β-ionone ring into a keto group, and one of the specific examples is an enzyme activity for
synthesizing astaxanthin with zeaxanthin as a substrate by way of 4-ketozeaxanthin (see Fig. 11). In this connection, the
polypeptides having such enzyme activities and the DNA strands encoding the polypeptides have not hitherto been
reported, and the polypeptides or the DNA strands encoding the polypeptides have no overall homology to polypeptides
or DNA strands which have hitherto been reported. Moreover, no such information has been reported that one enzyme
has an activity to convert directly a dihydrocarbonyl group of not only the β-ionone ring and the 3-hydroxy-β-ionone
ring but also the other compounds into a keto group. Moreover, a homology of CrtW as high as 83% identity at the amino
acid sequence level was shown between Agrobacterium and Alcaligenes.
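
Identity values such as the one quoted above are computed from a pairwise alignment of two amino acid sequences. The following is a minimal sketch, assuming the sequences are already aligned with "-" as the gap character and that identity is counted over columns in which neither sequence has a gap; the description does not state which alignment method or definition was used. The toy sequences are hypothetical and are not CrtW sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the gap-free columns of two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns where both sequences carry a residue.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != gap and y != gap]
    if not columns:
        raise ValueError("no gap-free columns to compare")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Hypothetical toy alignment: 8 of 10 positions are identical.
    print(percent_identity("MKTAYLLPGA", "MKSAYILPGA"))
```

Other conventions, for example dividing by the length of the shorter sequence or by the full alignment length, give somewhat different figures, so a reported identity is best read together with the method that produced it.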

On the other hand, it is possible to allow a microorganism such as Escherichia coli or the like to produce
β-carotene or zeaxanthin by using the carotenoid synthesis genes of the non-photosynthetic bacterium Erwinia; that is,
the crtE, crtB, crtI and crtY genes of Erwinia afford the microorganism such as Escherichia coli or the like the
β-carotene-producing ability, and the crtE, crtB, crtI, crtY and crtZ genes of Erwinia afford the microorganism such as
Escherichia coli or the like the zeaxanthin-producing ability (see Fig. 10 and Laid-Open Publication of WO91/13078).
Thus, the substrate of CrtW is supplied by the crt gene cluster of Erwinia, so that when the crtW gene is additionally
introduced into the microorganism such as Escherichia coli or the like which contains the aforementioned crt gene
cluster of Erwinia, the β-carotene-producing microorganism will produce canthaxanthin by way of echinenone, and the
zeaxanthin-producing microorganism will produce astaxanthin by way of 4-ketozeaxanthin.

Hydroxyl group-introducing enzyme gene (crtZ)



The DNA strands (19) - (27) are genes encoding a hydroxyl group-introducing enzyme (referred to hereinafter as
crtZ). Typical examples of the genes are crtZ genes cloned from the marine bacteria Agrobacterium aurantiacus sp.
nov. MK1 or Alcaligenes sp. PC-1, which are the DNA strands comprising the nucleotide sequences encoding the
polypeptides having the amino acid sequences C - D in Fig. 2 (amino acid Nos. 1 - 162 in SEQ ID NO: 2) or C - D in
Fig. 15 (amino acid Nos. 1 - 162 in SEQ ID NO: 6). The crtZ gene product (also referred to hereinafter as CrtZ) has
an enzyme activity for adding a hydroxyl group to the 3-carbon atom of the β-ionone ring, and one of the specific
examples is an enzyme activity for synthesizing zeaxanthin with use of β-carotene as a substrate by way of
β-cryptoxanthin (see Fig. 11). In addition, the crtZ gene product also has an enzyme activity for adding a hydroxyl group
to the 3-carbon atom of the 4-keto-β-ionone ring, and one of the specific examples is an enzyme activity for synthesizing
astaxanthin with canthaxanthin as a substrate by way of phoenicoxanthin (see Fig. 11). In this connection, the
polypeptide having the latter enzyme activity and the DNA strand encoding the polypeptide have not hitherto been
reported. Moreover, CrtZ of Agrobacterium and Alcaligenes showed a high homology with CrtZ of Erwinia uredovora
(57% and 58% identity, respectively) at the amino acid sequence level. Also, a high homology of 90% identity at the
amino acid sequence level was shown between the CrtZ of Agrobacterium and Alcaligenes.
It has been described above that it is possible to allow a microorganism such as Escherichia coli or the like to
produce β-carotene by using the carotenoid synthetic genes of the non-photosynthetic bacterium Erwinia. Moreover, it
has been described above that it is possible to allow a microorganism such as Escherichia coli or the like to produce
canthaxanthin by adding crtW thereto. Thus, the substrate of CrtZ of Agrobacterium or Alcaligenes is supplied by the
crtE, crtB, crtI and crtY genes of Erwinia (production of β-carotene), and the crtW gene of Agrobacterium or
Alcaligenes added thereto, so that when the crtZ gene of Agrobacterium or Alcaligenes is introduced into a
microorganism such as Escherichia coli or the like containing the crt gene group, the β-carotene-producing
microorganism will produce zeaxanthin by way of β-cryptoxanthin, and the canthaxanthin-producing microorganism will
produce astaxanthin by way of phoenicoxanthin.
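
The gene combinations described in this and the preceding section can be summarized as a small reachability sketch. This is an illustrative model only: the names `RULES` and `reachable` are hypothetical, the crtE, crtB and crtI steps are collapsed into a single FPP-to-lycopene rule, and only the main conversions named in the text are included, without the minor routes of Fig. 20.

```python
# Each rule: (genes that must all be present, substrate, product).
RULES = [
    ({"crtE", "crtB", "crtI"}, "FPP", "lycopene"),      # collapsed early steps
    ({"crtY"}, "lycopene", "beta-carotene"),            # cyclization
    ({"crtZ"}, "beta-carotene", "beta-cryptoxanthin"),  # 3-hydroxylation
    ({"crtZ"}, "beta-cryptoxanthin", "zeaxanthin"),
    ({"crtW"}, "beta-carotene", "echinenone"),          # 4-ketolation
    ({"crtW"}, "echinenone", "canthaxanthin"),
    ({"crtW"}, "zeaxanthin", "4-ketozeaxanthin"),
    ({"crtW"}, "4-ketozeaxanthin", "astaxanthin"),
    ({"crtZ"}, "canthaxanthin", "phoenicoxanthin"),
    ({"crtZ"}, "phoenicoxanthin", "astaxanthin"),
]


def reachable(genes, start="FPP"):
    """Return every compound obtainable from `start` with the given gene set."""
    genes = set(genes)
    found = {start}
    changed = True
    # Apply the rules repeatedly until no new compound appears.
    while changed:
        changed = False
        for required, substrate, product in RULES:
            if required <= genes and substrate in found and product not in found:
                found.add(product)
                changed = True
    return found


if __name__ == "__main__":
    base = {"crtE", "crtB", "crtI", "crtY"}
    print(sorted(reachable(base)))
    print(sorted(reachable(base | {"crtW"})))
    print(sorted(reachable(base | {"crtW", "crtZ"})))
```

In this model the β-carotene-producing gene set plus crtW reaches canthaxanthin but not astaxanthin, and astaxanthin appears only when crtZ is present as well, in line with the description above.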



Lycopene-cyclizing enzyme gene (crtY) 

The DNA strand encoding the amino acid sequence substantially from E to F of Figs. 3 and 4 (amino acid Nos. 1 -
386 in SEQ ID NO: 3) is a gene encoding a lycopene-cyclizing enzyme (referred to hereinafter as crtY). A typical
example of the gene is the crtY gene cloned from the marine bacterium Agrobacterium aurantiacus sp. nov. MK1, which is
the DNA strand comprising the nucleotide sequence encoding the polypeptide having the amino acid sequence E - F
in Figs. 3 and 4. The crtY gene product (also referred to hereinafter as CrtY) has an enzyme activity for synthesizing
β-carotene with lycopene as a substrate (see Fig. 11). It is possible to allow a microorganism such as Escherichia coli
or the like to produce lycopene by using the carotenoid biosynthesis genes of the non-photosynthetic bacterium Erwinia;
that is, the crtE, crtB and crtI genes of Erwinia give a microorganism such as Escherichia coli or the like a lycopene
biosynthesis ability (see Fig. 10, and Laid-Open Publication of WO91/13078). Thus, the substrate of the CrtY of
Agrobacterium is supplied by the crt gene group of Erwinia, so that when the crtY of Agrobacterium is introduced into a
microorganism such as Escherichia coli or the like containing the crt gene group, it is possible to allow the
microorganism to produce β-carotene.

30 In this connection, the CrtY of Agrobacterium has a significant homology of 44.3% identity to the CrtY of Erwinia 
uredovora at the amino acid sequence level, and these CrtY enzymes also have the same enzymatic function (see Figs. 
10 and 11). 
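
The gene-to-product relationships described so far can be summarized as a small lookup. The following is only an illustrative sketch of the logic stated in this description (crtE and crtB for phytoene, crtI for lycopene, crtY for β-carotene, crtW for keto groups, crtZ for hydroxyl groups); the function name and the rule that intermediates are ignored are assumptions made for the illustration, not part of the invention.

```python
def predicted_end_product(genes):
    """Return the expected final carotenoid for a set of crt genes.

    Simplified model: only the end product is reported; intermediates such as
    echinenone, beta-cryptoxanthin or phoenicoxanthin are ignored.
    """
    g = set(genes)
    if not {"crtE", "crtB"} <= g:
        return "no carotenoid"
    if "crtI" not in g:
        return "phytoene"
    if "crtY" not in g:
        # Without the cyclase there is no ionone ring for CrtW or CrtZ to act on.
        return "lycopene"
    has_keto = "crtW" in g       # keto group-introducing enzyme gene
    has_hydroxy = "crtZ" in g    # hydroxyl group-introducing enzyme gene
    if has_keto and has_hydroxy:
        return "astaxanthin"
    if has_keto:
        return "canthaxanthin"
    if has_hydroxy:
        return "zeaxanthin"
    return "beta-carotene"


if __name__ == "__main__":
    base = {"crtE", "crtB", "crtI", "crtY"}
    for extra in (set(), {"crtW"}, {"crtZ"}, {"crtW", "crtZ"}):
        print(sorted(extra), "->", predicted_end_product(base | extra))
```

Real yields depend on expression levels and culture conditions, which this lookup does not model.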

Bacteriological properties of marine bacteria 



The marine bacteria Agrobacterium aurantiacus sp. nov. MK1 and Alcaligenes sp. PC-1 as the sources of the
xanthophyll synthetic genes show the following bacteriological properties.

(Agrobacterium aurantiacus sp. nov. MK1)

(1) Morphology

Form and size of bacterium: rod, 0.9 μm x 1.2 μm;
Motility: yes;
Flagellum: peripheric flagellum;
Polymorphism of cell: none;
Sporogenesis: none;
Gram staining: negative.

(2) Growths in culture media

Broth agar plate culture: non-diffusive circular orange colonies having a gloss are formed.
Broth agar slant culture: a non-diffusive orange band having a gloss is formed.
Broth liquid culture: homogeneous growth all over the culture medium with an orange color.
Broth gelatin stab culture: growth over the surface around the stab pore.

(3) Physiological properties

Reduction of nitrate: positive;
Denitrification reaction: negative;
Formation of indole: negative;
Utilization of citric acid: negative;
Formation of pigments: fat-soluble reddish orange pigment;
Urease activity: negative;
Oxidase activity: positive;
Catalase activity: positive;
β-Glucosidase activity (esculin degradability): positive;
β-Galactosidase activity: positive;
Growth range: pH, 5 - 9; temperature, 10 - 40°C;
Behavior towards oxygen: aerobic;
Durability to seawater: positive;
O-F test: oxidation;
Anabolic ability of saccharides:
Positive: D-glucose, D-mannose, D-galactose, D-fructose, lactose, maltose, sucrose, glycogen, N-acetyl-D-glucosamine;
Negative: L-arabinose, D-mannitol, inositol, L-rhamnose, D-sorbitol;
Anabolic ability of organic acids:
Positive: lactate;
Negative: citrate, malate, gluconate, caprinate, succinate, adipate;
Anabolic ability of the other organic materials:
Positive: inosine, uridine, glucose-1-phosphate, glucose-6-phosphate;
Negative: gelatin, L-arginine, DNA, casein.

(Alcaligenes sp. PC-1) 

(1) Morphology 

Form and size of bacterium: short rod, 1.4 μm;
Motility: yes;
Flagellum: peripheric flagellum;
Polymorphism of cell: none;
Sporogenesis: none;
Gram staining: negative.

(2) Growths in culture media

Broth agar plate culture: non-diffusive circular orange colonies having a gloss are formed.
Broth agar slant culture: a non-diffusive orange band having a gloss is formed.
Broth liquid culture: homogeneous growth all over the culture medium with an orange color.
Broth gelatin stab culture: growth over the surface around the stab pore.

(3) Physiological properties

Formation of pigments: fat-soluble reddish orange pigment;
Oxidase activity: positive;
Catalase activity: positive;
Growth range: pH, 5 - 9; temperature, 10 - 40°C;
Behavior towards oxygen: aerobic;
Durability to seawater: positive;
O-F test: oxidation;
Degradability of gelatin: negative.

Xanthophyll synthetic gene cluster of the other marine bacteria 

It has hitherto been reported that 16 marine bacteria have an ability to synthesize ketocarotenoids such as
astaxanthin and the like (Yokoyama, A., Izumida, H., Miki, W., "Marine bacteria produced astaxanthin", 10th International
Symposium on Carotenoids, Abstract, CL11-3, 1993). If either of the crt genes of the aforementioned marine bacteria
Agrobacterium aurantiacus sp. nov. MK1 or Alcaligenes sp. PC-1 is used as a probe, the gene cluster playing a role in
the biosynthesis of ketocarotenoids such as astaxanthin and the like should be obtainable from the other
astaxanthin-producing marine bacteria by using the homology of the genes. In fact, the present inventors have successfully obtained
the crtW and crtZ genes as strongly hybridizing DNA fragments from the chromosomal DNA of Alcaligenes PC-1
with use of a DNA fragment containing crtW and crtZ of Ag. aurantiacus sp. nov. MK1 as a probe (see Examples for
the details). Furthermore, when Alteromonas SD-402 was selected from the remaining 14 marine bacteria having an
astaxanthin synthetic ability, and a chromosomal DNA was prepared therefrom and subjected to Southern hybridization
with a DNA fragment containing crtW and crtZ of Ag. aurantiacus sp. nov. MK1, the probe hybridized
with the bands derived from the chromosomal DNA of the marine bacterium. The DNA strands according to the present
invention also include a DNA strand which hybridizes with the DNA strands (2), (4), (6), (8), (11), (13), (15), (17), (20),
(22), (24) and (26).


Acquisition of DNA strands 

One method for obtaining the DNA strand having a nucleotide sequence which encodes the amino acid sequence of each
enzyme described above is to chemically synthesize at least a part of the strand length according to a method for
synthesizing nucleic acids. It is believed more preferable, however, to digest the total DNA with an appropriate
restriction enzyme, prepare a library in Escherichia coli, and obtain the DNA strand from the library by the methods
conventionally used in the art of genetic engineering, such as a hybridization method with an appropriate probe (see
the section on the xanthophyll synthetic gene cluster of the other marine bacteria).

Transformation of a microorganism such as Escherichia coli and gene expression

A variety of xanthophylls can be prepared by introducing the present DNA strands described above into appropriate
microorganisms such as bacteria, for example Escherichia coli, Zymomonas mobilis and Agrobacterium tumefaciens,
and yeasts, for example Saccharomyces cerevisiae.

The outline for introducing a foreign gene into a preferred microorganism is described below.
The procedure or method for introducing and expressing the foreign gene in a microorganism such as Escherichia
coli or the like comprises, in addition to those described below in the present invention, the ones usually used in the
art of genetic engineering, and may be carried out according to such procedures or methods (see, e.g., "Vectors for Cloning
Genes", Methods in Enzymology, 216, p. 469-631, 1992, Academic Press, and "Other Bacterial Systems", Methods in
Enzymology, 204, p. 305-636, 1991, Academic Press).

(Escherichia coli)

The method for introducing foreign genes into Escherichia coli includes several efficient methods such as
Hanahan's method and the rubidium method, and the foreign genes may be introduced according to these methods (see, for
example, Sambrook, J., Fritsch, E.F., Maniatis, T., "Molecular Cloning - A Laboratory Manual", Cold Spring Harbor
Laboratory Press, 1989). While foreign genes in Escherichia coli may be expressed according to the conventional methods
(see, for example, "Molecular Cloning - A Laboratory Manual"), the expression can be carried out for example with a
vector for Escherichia coli having a lac promoter in the pUC or pBluescript series. The present inventors have used a
vector pBluescript II SK or KS for Escherichia coli having a lac promoter and the like to insert the crtW, crtZ and crtY
genes of Agrobacterium aurantiacus sp. nov. MK1 and the crtW and crtZ genes of Alcaligenes sp. PC-1, and allowed
these genes to be expressed in Escherichia coli.

40 (Yeast) 

The method for introducing foreign genes into the yeast Saccharomyces cerevisiae includes the methods which have
already been established such as the lithium method and the like, and the introduction may be carried out according to
these methods (see, for example, Ed. Yuichi Akiyama, compiled by Bio-industry Association, "New Biotechnology of
Yeast", published by IGAKU SHUPPAN CENTER). Foreign genes can be expressed in yeast by using a promoter and
a terminator such as those of PGK and GPD to construct an expression cassette in which the foreign gene is inserted
between the promoter and the terminator so that transcription is led through, and inserting the expression cassette into a
vector such as the YRp system, which is a multi-copy vector for yeast having the ARS sequence of the yeast chromosome as
the replication origin, the YEp system, which is a multi-copy vector for yeast having the replication origin of the 2 μm
DNA of yeast, or the YIp system, which is a vector for integration into a yeast chromosome having no replication origin of
yeast (see "New Biotechnology of Yeast", published by IGAKU SHUPPAN CENTER, ibid.; NIPPON NOGEI-KAGAKU
KAI ABC Series "Genetic Engineering for Producing Materials", published by ASAKURA SHOTEN; and Yamano, S.,
Ishii, T., Nakagawa, M., Ikenaga, H., Misawa, N., "Metabolic Engineering for Production of β-Carotene and Lycopene in
Saccharomyces cerevisiae", Biosci. Biotech. Biochem., 58, p. 1112-1114, 1994).

(Zymomonas mobilis)

Foreign genes can be introduced into the ethanol-producing bacterium Zymomonas mobilis by the conjugal transfer
method which is common to Gram-negative bacteria, and the foreign genes can be expressed by using a vector pZA22
for Zymomonas mobilis (see Katsumi Nakamura, "Molecular Breeding of Zymomonas mobilis", Nippon Nogei-Kagaku
Kaishi, 63, p. 1016-1018, 1989; and Misawa, N., Yamano, S., Ikenaga, H., "Production of β-Carotene in Zymomonas
mobilis and Agrobacterium tumefaciens by Introduction of the Biosynthesis Genes from Erwinia uredovora", Appl.
Environ. Microbiol., 57, p. 1847-1849, 1991).

(Agrobacterium tumefaciens)

Foreign genes can be introduced into the plant pathogenic bacterium Agrobacterium tumefaciens by the conjugal
transfer method which is common to Gram-negative bacteria, and the foreign genes can be expressed by using a vector
pBI121 for a bacterium such as Agrobacterium tumefaciens (see Misawa, N., Yamano, S., Ikenaga, H., "Production of
β-Carotene in Zymomonas mobilis and Agrobacterium tumefaciens by Introduction of the Biosynthesis Genes from
Erwinia uredovora", Appl. Environ. Microbiol., 57, p. 1847-1849, 1991).

Production of xanthophylls by microorganisms

The gene cluster for the synthesis of ketocarotenoids such as astaxanthin derived from a marine bacterium can be
introduced and expressed by the procedure or method described above for introducing and expressing a foreign gene
in a microorganism.

Farnesyl pyrophosphate (FPP) is a substrate which is common not only to carotenoids but also to other terpenoids
such as sesquiterpenes, triterpenes, sterols, hopanols and the like. In general, microorganisms synthesize terpenoids
even if they cannot synthesize carotenoids, so that all of the microorganisms should basically have FPP as an
intermediate metabolite. Furthermore, the carotenoid synthesis gene cluster of the non-photosynthetic bacterium Erwinia has an
ability to synthesize the substrates of the crt gene products of Agrobacterium aurantiacus sp. nov. MK1 or Alcaligenes
sp. PC-1 by using FPP as a substrate (see Fig. 10). The present inventors have already confirmed that when the group
of crt genes of Erwinia is introduced into not only Escherichia coli but also the aforementioned microorganisms, that is,
the yeast Saccharomyces cerevisiae, the ethanol-producing bacterium Zymomonas mobilis, or the plant pathogenic
bacterium Agrobacterium tumefaciens, carotenoids such as β-carotene and the like can be produced, as was expected,
by these microorganisms (Yamano, S., Ishii, T., Nakagawa, M., Ikenaga, H., Misawa, N., "Metabolic Engineering for
Production of β-Carotene and Lycopene in Saccharomyces cerevisiae", Biosci. Biotech. Biochem., 58, p. 1112-1114, 1994;
Misawa, N., Yamano, S., Ikenaga, H., "Production of β-Carotene in Zymomonas mobilis and Agrobacterium
tumefaciens by Introduction of the Biosynthesis Genes from Erwinia uredovora", Appl. Environ. Microbiol., 57, p. 1847-1849,
1991; and Japanese Patent Application No. 58786/1991 (Japanese Patent Application No. 53255/1990) by the present
inventors: "DNA Strands useful for the Synthesis of Carotenoids").

Thus, it should be possible in principle to allow all of the microorganisms in which a gene introduction and
expression system has been established to produce ketocarotenoids such as astaxanthin and the like by introducing
the combination of the carotenoid synthesis gene cluster derived from Erwinia and the DNA strands according to the
present invention (typically the carotenoid synthesis gene cluster derived from Agrobacterium aurantiacus sp. nov. MK1
or Alcaligenes sp. PC-1) at the same time into the same microorganism. The processes for producing a variety of
ketocarotenoids in microorganisms are described below.



(Production of canthaxanthin and echinenone)

It is possible to produce canthaxanthin as a final product and echinenone as an intermediate metabolite by
introducing into a microorganism such as Escherichia coli, and expressing therein, the crtE, crtB, crtI and crtY genes of
Erwinia uredovora required for the synthesis of β-carotene and any one of the DNA strands of the present invention (1) - (9)
which is a keto group-introducing enzyme gene (typically, the crtW gene of Agrobacterium aurantiacus sp. nov. MK1 or
Alcaligenes PC-1). The yields or the ratio of canthaxanthin and echinenone can be changed by controlling the expression
level of the DNA strand (crtW gene) or by adjusting the culturing conditions of the microorganism having the DNA strand.
Two embodiments in Escherichia coli are described below, and more details will be illustrated in Examples.

A plasmid pACCAR16ΔcrtX, in which a fragment containing the crtE, crtB, crtI and crtY genes of Erwinia uredovora has
been inserted into the Escherichia coli vector pACYC184, and a plasmid pAK916, in which a fragment containing the crtW
gene of Agrobacterium aurantiacus sp. nov. MK1 has been inserted into the Escherichia coli vector pBluescript II SK-,
were introduced into Escherichia coli JM101, and the cells were cultured to the stationary phase and collected to extract
carotenoid pigments. The extracted pigments comprised 94% of canthaxanthin and 6% of echinenone. Also,
canthaxanthin was obtained in a yield of 3 mg starting from 2 liters of the culture solution.

A plasmid pACCAR16ΔcrtX, in which a fragment containing the crtE, crtB, crtI and crtY genes of Erwinia uredovora has
been inserted into the Escherichia coli vector pACYC184, and a plasmid pPC17-3, in which a fragment containing the crtW
gene of Alcaligenes PC-1 has been inserted into the Escherichia coli vector pBluescript II SK+, were introduced into
Escherichia coli JM101, and the cells were cultured to the stationary phase and collected to extract carotenoid pigments.
The extracted pigments comprised 40% of canthaxanthin and 50% of echinenone. The remainder comprised 10% of
unreacted β-carotene.
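
One way to compare the two embodiments above is to ask what fraction of the ionone rings carries a keto group, given that echinenone has one keto group and canthaxanthin has two. The sketch below is only an illustration: the metric itself is not used in this description, and it assumes that the reported percentages can be treated as molar fractions, which the text does not state.

```python
# Number of 4-keto groups per molecule (each carotenoid here has two ionone rings).
KETO_GROUPS = {"beta-carotene": 0, "echinenone": 1, "canthaxanthin": 2}


def ketolated_ring_fraction(composition_percent):
    """Fraction of ionone rings carrying a keto group.

    composition_percent maps pigment name -> percentage of extracted pigments.
    Assumption: percentages are treated as molar fractions.
    """
    total = sum(composition_percent.values())
    keto = sum(KETO_GROUPS[name] * pct for name, pct in composition_percent.items())
    return keto / (2 * total)


if __name__ == "__main__":
    mk1 = {"canthaxanthin": 94, "echinenone": 6}
    pc1 = {"canthaxanthin": 40, "echinenone": 50, "beta-carotene": 10}
    print("crtW of MK1: ", ketolated_ring_fraction(mk1))
    print("crtW of PC-1:", ketolated_ring_fraction(pc1))
```

Under these assumptions the first embodiment converts nearly all rings, while the second leaves roughly a third unconverted, which is consistent with the statement that the ratio depends on expression level and culture conditions.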

(Production of astaxanthin and 4-ketozeaxanthin)

It is possible to produce astaxanthin as a final product and 4-ketozeaxanthin as an intermediate metabolite by
introducing into a microorganism such as Escherichia coli or the like, and expressing therein, the crtE, crtB, crtI, crtY and
crtZ genes of Erwinia uredovora required for the synthesis of zeaxanthin and any one of the DNA strands of the present
invention (10) - (18) which is a keto group-introducing enzyme gene (typically, the crtW gene of Agrobacterium aurantiacus
sp. nov. MK1 or Alcaligenes PC-1). The yields or the ratio of astaxanthin and 4-ketozeaxanthin can be changed by
controlling the expression level of the DNA strand (crtW gene) or by adjusting the culturing conditions of the microorganism
having the DNA strand.

Two embodiments in Escherichia coli are described below, and more details will be illustrated in Examples.

A plasmid pACCAR25ΔcrtX, in which a fragment containing the crtE, crtB, crtI, crtY and crtZ genes of Erwinia uredovora
has been inserted into the Escherichia coli vector pACYC184, and a plasmid pAK916, in which a fragment containing the
crtW gene of Ag. aurantiacus sp. nov. MK1 has been inserted into the Escherichia coli vector pBluescript II SK-, were
introduced into Escherichia coli JM101, and the cells were cultured to the stationary phase and collected to extract
carotenoid pigments. The yield of the extracted pigments was 1.7 mg of astaxanthin and 1.5 mg of 4-ketozeaxanthin based
on 2 liters of the culture solution.

A plasmid pACCAR25ΔcrtX, in which a fragment containing the crtE, crtB, crtI, crtY and crtZ genes of Erwinia uredovora
has been inserted into the Escherichia coli vector pACYC184, and a plasmid pPC17-3, in which a fragment containing the
crtW gene of Alcaligenes PC-1 has been inserted into the Escherichia coli vector pBluescript II SK+, were introduced
into Escherichia coli JM101, and the cells were cultured to the stationary phase and collected to extract carotenoid
pigments. The yield of the extracted pigments was about 1 mg each of astaxanthin and 4-ketozeaxanthin, based on
2 liters of the culture solution.

(Production of astaxanthin and phoenicoxanthin)

It is possible to produce astaxanthin as a final product and phoenicoxanthin as an intermediate metabolite by
introducing into a microorganism such as Escherichia coli or the like, and expressing therein, the crtE, crtB, crtI and crtY
genes of Erwinia uredovora required for the synthesis of β-carotene, any one of the DNA strands of the present invention
(1) - (9) which is a keto group-introducing enzyme gene (typically, the crtW gene of Agrobacterium aurantiacus sp. nov. MK1 or
Alcaligenes PC-1), and any one of the DNA strands of the present invention (19) - (27) which is a hydroxyl
group-introducing enzyme gene (typically, the crtZ gene of Ag. aurantiacus sp. nov. MK1 or Alcaligenes PC-1). The yields or the
ratio of astaxanthin and phoenicoxanthin can be changed by controlling the expression level of the DNA strands (crtW
and crtZ genes) or by adjusting the culturing conditions of the microorganism having the DNA strands. An embodiment in
Escherichia coli is described below, and more details will be illustrated in Examples.

A plasmid pACCAR16ΔcrtX, in which a fragment containing the crtE, crtB, crtI and crtY genes of Erwinia uredovora has
been inserted into the Escherichia coli vector pACYC184, and a plasmid pAK96K, in which a fragment containing the crtW
and crtZ genes of Ag. aurantiacus sp. nov. MK1 has been inserted into the Escherichia coli vector pBluescript II SK-,
were introduced into Escherichia coli JM101, and the cells were cultured to the stationary phase and collected to extract
carotenoid pigments. The yield of the extracted pigments was 3 mg of astaxanthin and 2 mg of phoenicoxanthin
starting from 4 liters of the culture solution.
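
Because the astaxanthin embodiments above start from different culture volumes, it can help to normalize the reported yields per liter. The following is a minimal sketch using only the figures quoted in this description; the function names are introduced here for illustration.

```python
def per_liter(yields_mg, volume_l):
    """Convert total yields (mg) from a culture of volume_l liters to mg per liter."""
    return {pigment: mg / volume_l for pigment, mg in yields_mg.items()}


def product_fraction(yields_mg, product):
    """Share of one pigment in the total mass of the pigments listed."""
    return yields_mg[product] / sum(yields_mg.values())


if __name__ == "__main__":
    # crtW of MK1 in a zeaxanthin-producing host, 2 liters of culture
    via_ketozeaxanthin = {"astaxanthin": 1.7, "4-ketozeaxanthin": 1.5}
    # crtW and crtZ of MK1 in a beta-carotene-producing host, 4 liters of culture
    via_phoenicoxanthin = {"astaxanthin": 3.0, "phoenicoxanthin": 2.0}

    print(per_liter(via_ketozeaxanthin, 2.0))
    print(per_liter(via_phoenicoxanthin, 4.0))
    print(product_fraction(via_ketozeaxanthin, "astaxanthin"))
    print(product_fraction(via_phoenicoxanthin, "astaxanthin"))
```

The comparison is only indicative, since the two experiments are single embodiments rather than replicated measurements.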



Deposition of microorganisms

Microorganisms as the gene sources of the DNA strands of the present invention and Escherichia coli carrying the
isolated genes (the DNA strands of the present invention) have been deposited with the National Institute of Bioscience and
Human Technology, Agency of Industrial Science and Technology.

(i) Agrobacterium aurantiacus sp. nov. MK1
Deposition No: FERM BP-4506
Entrusted Date: December 20, 1993

(ii) Escherichia coli JM101 (pACCRT-EIB, pAK92)
Deposition No: FERM BP-4505
Entrusted Date: December 20, 1993

(iii) Alcaligenes sp. PC-1
Deposition No: FERM BP-4760
Entrusted Date: July 27, 1994

(iv) Escherichia coli β: pPC17
Deposition No: FERM BP-4761
Entrusted Date: July 27, 1994

Examples

The present invention is described more specifically with reference to the following examples, without restricting
the invention thereto. In addition, the ordinary gene manipulation experiments employed herein are based on the standard
methods (Sambrook, J., Fritsch, E.F., Maniatis, T., "Molecular Cloning - A Laboratory Manual", Cold Spring Harbor
Laboratory Press, 1989), unless otherwise specified.

Example 1: Preparation of chromosomal DNA

Chromosomal DNAs were prepared from three marine bacterial strains, i.e. Agrobacterium aurantiacus sp. nov.
MK1, Alcaligenes sp. PC-1, and Alteromonas SD-402 (Yokoyama, A., Izumida, H., Miki, W., "Marine bacteria produced
astaxanthin", 10th International Symposium on Carotenoids, Abstract, CL11-3, 1993). After each of these marine
bacteria was grown in 200 ml of a culture medium (a culture medium prepared according to the instructions for "Marine Broth"
manufactured by DIFCO) at 25°C for 4 days to the stationary phase, the bacterial cells were collected, washed with a
TES buffer (20 mM Tris, 10 mM EDTA, 0.1 M NaCl, pH 8), subjected to heat treatment at 68°C for 15 minutes, and
suspended in the solution I (50 mM glucose, 25 mM Tris, 10 mM EDTA, pH 8) containing 5 mg/ml of lysozyme
(manufactured by SEIKAGAKU KOGYO) and 100 μg/ml of RNase A (manufactured by Sigma). After incubation of the suspension
at 37°C for 1 hour, Proteinase K (manufactured by Boehringer-Mannheim) was added and the mixture was incubated
at 37°C for 10 minutes. After SARCOSIL (N-lauroylsarcosine Na, manufactured by Sigma) was then added at the final
concentration of 1% and the mixture was sufficiently mixed, it was incubated at 37°C for several hours. The mixture was
extracted several times with phenol/chloroform, and a two-fold volume of ethanol was added slowly. Chromosomal
DNA thus precipitated was wound around a glass rod, rinsed with 70% ethanol and dissolved in 2 ml of a TE buffer (10
mM Tris, 1 mM EDTA, pH 8) to prepare a chromosomal DNA solution.
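
The buffer compositions in Example 1 are given as molar concentrations. The sketch below converts them to weighed amounts for a chosen volume. It is an illustration only: this description does not say which reagent forms were used, so Tris base (121.14 g/mol), disodium EDTA dihydrate (372.24 g/mol), NaCl (58.44 g/mol) and anhydrous D-glucose (180.16 g/mol) are assumptions, and pH adjustment is not modeled.

```python
# Molar masses in g/mol; the reagent forms are assumptions (see the paragraph above).
MOLAR_MASS = {
    "Tris": 121.14,      # Tris base
    "EDTA": 372.24,      # disodium EDTA dihydrate
    "NaCl": 58.44,
    "glucose": 180.16,   # anhydrous D-glucose
}

# Concentrations in mM, taken from Example 1.
BUFFERS = {
    "TES": {"Tris": 20, "EDTA": 10, "NaCl": 100},
    "solution I": {"glucose": 50, "Tris": 25, "EDTA": 10},
    "TE": {"Tris": 10, "EDTA": 1},
}


def grams_needed(conc_mM, volume_ml, molar_mass):
    """Mass of solute (g) for the given concentration (mM) and volume (ml)."""
    return (conc_mM / 1000.0) * (volume_ml / 1000.0) * molar_mass


def recipe(buffer_name, volume_ml):
    """Return {component: grams} for one of the buffers listed in BUFFERS."""
    return {
        name: grams_needed(conc, volume_ml, MOLAR_MASS[name])
        for name, conc in BUFFERS[buffer_name].items()
    }


if __name__ == "__main__":
    for name in BUFFERS:
        print(name, {k: round(v, 3) for k, v in recipe(name, 1000).items()})
```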

Example 2: Preparation of hosts for a cosmid library 

(1) Preparation of phytoene-producing Escherichia coli

After the removal of the BstEII (1235) - Eco52I (4926) fragment from a plasmid pCAR16 having a carotenoid
synthesis gene cluster except the crtZ gene of Erwinia uredovora (Misawa, N., Nakagawa, M., Kobayashi, K., Yamano, S.,
Izawa, Y., Nakamura, K., Harashima, K., "Elucidation of the Erwinia uredovora Carotenoid Biosynthetic Pathway by
Functional Analysis of Gene Products expressed in Escherichia coli", J. Bacteriol., 172, p. 6704-6712, 1990; and
Japanese Patent Application No. 58786/1991 (Japanese Patent Application No. 53255/1990): "DNA Strands useful for the
Synthesis of Carotenoids"), a 2.3 kb Asp718 (KpnI) - EcoRI fragment containing the crtE and crtB genes required for
the production of phytoene was cut out. This fragment was then inserted into the EcoRV site of the E. coli vector
pACYC184 to give the desired plasmid (pACCRT-EB). The bacterium E. coli containing pACCRT-EB exhibits resistance
to the antibiotic chloramphenicol (Cm^r) and produces phytoene (Linden, H., Misawa, N., Chamovitz, D., Pecker, I.,
Hirschberg, J., Sandmann, G., "Functional Complementation in Escherichia coli of Different Phytoene Desaturase Genes
and Analysis of Accumulated Carotenes", Z. Naturforsch., 46c, 1045-1051, 1991).

(2) Preparation of lycopene-producing Escherichia coli

After the removal of the BstEII (1235) - SnaBI (3497) fragment from a plasmid pCAR16 having a carotenoid
synthesis gene cluster except the crtZ gene of Erwinia uredovora, a 3.75 kb Asp718 (KpnI) - EcoRI fragment containing
the crtE, crtI and crtB genes required for the production of lycopene was cut out. This fragment was then inserted into
the EcoRV site of the E. coli vector pACYC184 to give the desired plasmid (pACCRT-EIB). The bacterium E. coli
containing pACCRT-EIB exhibits Cm^r and produces lycopene (Cunningham Jr., F.X., Chamovitz, D., Misawa, N., Gatt, E.,
Hirschberg, J., "Cloning and Functional Expression in Escherichia coli of Cyanobacterial Gene for Lycopene Cyclase, the
Enzyme that catalyzes the Biosynthesis of β-Carotenes", FEBS Lett., 328, 130-138, 1993).

(3) Preparation of β-carotene-producing Escherichia coli

After the crtX gene was inactivated by subjecting a plasmid pCAR16 having a carotenoid synthesis gene cluster
except the crtZ gene of Erwinia uredovora to digestion with the restriction enzyme BstEII, the Klenow fragment treatment
and the ligation reaction, a 6.0 kb Asp718 (KpnI) - EcoRI fragment containing the crtE, crtY, crtI and crtB genes required
for the production of β-carotene was cut out. This fragment was then inserted into the EcoRV site of the E. coli vector
pACYC184 to give the desired plasmid (referred to hereinafter as pACCAR16ΔcrtX). The bacterium E. coli containing
pACCAR16ΔcrtX exhibits Cm^r and produces β-carotene. In this connection, the restriction enzymes and the enzymes used
for genetic manipulation were purchased from TAKARA SHUZO (K.K.) or Boehringer-Mannheim.

Example 3: Preparation of a cosmid library and acquisition of Escherichia coli exhibiting an orange color

After the restriction enzyme Sau3AI was added in an amount of one unit to 25 μg of the chromosomal DNA of
Agrobacterium aurantiacus sp. nov. MK1, the mixture was incubated at 37°C for 15 minutes and heat treated at 68°C for 10
minutes to inactivate the restriction enzyme. Under this condition, many fragments partially digested with Sau3AI were
obtained at about 40 kb. The cosmid vector pJB8 (resistant to ampicillin (Ap^r)) which had been subjected to BamHI
digestion and alkaline phosphatase treatment and the right arm (shorter fragment) of pJB8 which had been digested
with SalI/BamHI and then recovered from the gel were mixed with a part of the above Sau3AI partial fragments, and
ligated at 12°C overnight. In this connection, pJB8 was purchased from Amersham.

Phage particles were obtained in an amount sufficient for preparing a cosmid library by in vitro packaging with
Gigapack Gold (manufactured by Stratagene; available from Funakoshi) using the DNA ligated above.
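
A common way to judge whether a library is large enough is the Clarke and Carbon estimate, N = ln(1 - P) / ln(1 - f), where f is the insert size divided by the genome size and P is the desired probability that a given sequence is represented. The sketch below applies it to the roughly 40 kb inserts mentioned above. The genome size of Agrobacterium aurantiacus sp. nov. MK1 is not given in this description, so the 4 Mb value is purely a hypothetical placeholder, and the estimate says nothing about whether a cloned fragment is actually expressed in the host.

```python
import math


def clones_needed(insert_bp, genome_bp, probability):
    """Clarke-Carbon estimate of the number of independent clones required.

    insert_bp   : average insert size in base pairs
    genome_bp   : genome size in base pairs (hypothetical value in the example)
    probability : desired probability (0 < p < 1) that a given locus is present
    """
    if not 0 < probability < 1:
        raise ValueError("probability must be between 0 and 1 (exclusive)")
    fraction = insert_bp / genome_bp
    return math.ceil(math.log(1 - probability) / math.log(1 - fraction))


if __name__ == "__main__":
    # 40 kb inserts (from Example 3); 4 Mb genome is an assumed placeholder.
    print(clones_needed(40_000, 4_000_000, 0.99))
```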

After Escherichia coli DH1 (ATCC33849) and Escherichia coli DH1 carrying each one of the three plasmids
prepared in Example 2 were infected with the phage particles, these bacteria were diluted so that 100 - 300 colonies
were found on a plate, plated on LB (1% tryptone, 0.5% yeast extract, 1% NaCl) containing appropriate antibiotics, and
cultured at 37°C or room temperature for a period of overnight to several days.

As a result, in the cosmid libraries having the simple Escherichia coli (beige) or the phytoene-producing Escherichia
coli (beige) with pACCRT-EB as a host, no colonies with changed color were obtained notwithstanding the screening of
ten thousand or more colonies for the respective libraries. On the other hand, in the cosmid libraries having the
lycopene-producing Escherichia coli (light red) with pACCRT-EIB or the β-carotene-producing Escherichia coli (yellow) with
pACCAR16ΔcrtX as a host, colonies exhibiting orange appeared in a proportion of one strain to several hundred
colonies, respectively. Most of these transformed Escherichia coli strains which exhibit orange contained plasmid pJB8
in which about 40 kb partially digested Sau3AI fragments were cloned. It is also understood, from the fact that no
colonies with changed color appeared in the cosmid libraries having the simple Escherichia coli or the phytoene-producing
Escherichia coli with pACCRT-EB as a host, that Escherichia coli having an ability of producing a carotenoid synthetic
intermediate of the steps later than at least phytoene should be used as a host for the purpose of expression-cloning the
xanthophyll synthesis gene cluster from the chromosomal DNA of Agrobacterium aurantiacus sp. nov. MK1.
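
The reasoning in the preceding paragraph can be written down as a small table of the four hosts and a rule that picks the earliest pathway intermediate for which orange colonies were found. This is only a sketch of the inference stated in Example 3; the data structure and function name are illustrative.

```python
# Order of intermediates along the pathway, as used for the hosts of Example 2.
PATHWAY_ORDER = ["none", "phytoene", "lycopene", "beta-carotene"]

# (host plasmid, intermediate produced by the host, host colony color, orange colonies found?)
SCREEN_RESULTS = [
    (None, "none", "beige", False),
    ("pACCRT-EB", "phytoene", "beige", False),
    ("pACCRT-EIB", "lycopene", "light red", True),
    ("pACCAR16ΔcrtX", "beta-carotene", "yellow", True),
]


def earliest_permissive_intermediate(results):
    """Return the earliest intermediate whose host gave orange colonies, or None."""
    permissive = [intermediate for _, intermediate, _, found in results if found]
    if not permissive:
        return None
    return min(permissive, key=PATHWAY_ORDER.index)


if __name__ == "__main__":
    print(earliest_permissive_intermediate(SCREEN_RESULTS))
```

With the results reported above, the earliest permissive host is the lycopene producer, which matches the conclusion that an intermediate later than phytoene is required.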

Example 4: Localization of a fragment containing an orange pigment synthesis gene cluster

When several tens of individual colonies out of the orange colonies obtained in the cosmid libraries having the
lycopene-producing Escherichia coli (light red) with pACCRT-EIB or the β-carotene-producing Escherichia coli (yellow) with
pACCAR16ΔcrtX as a host were selected to analyze the plasmids, 33 kb - 47 kb fragments partially digested with
Sau3AI were inserted in vector pJB8 in all of the colonies except one strain. The remaining one strain
(lycopene-producing Escherichia coli as a host) contains a plasmid in which a 3.9 kb fragment partially digested with Sau3AI was
inserted in pJB8 (referred to hereinafter as plasmid pAK9). This was considered to be the one formed by the in vivo
deletion of the inserted fragment after the infection of Escherichia coli. The same pigment (identified as astaxanthin in
Example 6) as that in the orange colonies obtained from the other cosmid libraries was successfully synthesized with
the lycopene-producing Escherichia coli having pAK9. pAK9 was used as a material in the following analyses.

Example 5: Determination of the nucleotide sequence in the orange pigment synthesis gene cluster

A 3.9 kb EcoRI inserted fragment prepared from pAK9 was inserted into the EcoRI site of the Escherichia coli
vector pBluescript II SK+ to give two plasmids (pAK91 and pAK92) with the opposite directions of the fragment to the vector.
The restriction enzyme map of one of the plasmids (pAK92) is illustrated in Fig. 12. When pAK92 was introduced into
the lycopene-producing Escherichia coli, orange colonies were obtained as a result of the synthesis of astaxanthin
(Example 6). However, no ability for synthesizing new pigments was afforded even if pAK91 was introduced into the
lycopene-producing Escherichia coli. It was thus considered that the pigment synthesis gene cluster in the plasmid pAK92
has the same direction as that of the lac promoter of the vector. Next, each of a 2.7 kb PstI fragment obtained by the
PstI digestion of pAK91, a 2.9 kb BamHI fragment obtained by the BamHI digestion of pAK92, and 2.3 kb and 1.6 kb
SalI fragments obtained by the SalI digestion of pAK92 was cloned into the vector pBluescript II SK-. The restriction
maps of plasmids referred to as pAK94, pAK96, pAK98, pAK910, pAK93, and pAK95 are illustrated in Fig. 12. The
plasmids pAK94, pAK96, pAK98 and pAK910 have the pigment synthesis gene cluster in the same direction as that of the
lac promoter of the vector, while the plasmids pAK93 and pAK95 have the pigment synthesis gene cluster in the
opposite direction to that of the promoter.

It was found that when the plasmid pAK96 having a 2.9 kb BamHI fragment was introduced into the
lycopene-producing Escherichia coli, the transformant also synthesized astaxanthin as in the case when the plasmid pAK92 having
a 3.9 kb EcoRI fragment was introduced (Example 6), so that the DNA sequence of the 2.9 kb BamHI fragment was
determined.

The DNA sequence was determined by preparing deletion mutants of the 2.9 kb BamHI fragment from the normal
and opposite directions and determining the sequence using clones having various lengths of deletions. The deletion
mutants were prepared from the four plasmids pAK96, pAK98, pAK93 and pAK95 according to the following procedure:
Each of the plasmids, 10 µg, was digested with SacI and XbaI and extracted with phenol/chloroform, and the DNA was
recovered by ethanol precipitation. Each DNA was dissolved in 100 µl of ExoIII buffer (50 mM Tris-HCl, 100 mM NaCl,
5 mM MgCl2, 10 mM 2-mercaptoethanol, pH 8.0), 180 units of ExoIII nuclease were added, and the mixture was
maintained at 37°C. A 10 µl portion was sampled every minute, and every two samples were transferred into a tube
containing 20 µl of MB buffer (40 mM sodium acetate, 100 mM NaCl, 2 mM ZnCl2, 10% glycerol, pH 4.5) and placed
on ice. After completion of the sampling, the five tubes thus obtained were maintained at 65°C for 10 minutes to
inactivate the enzyme, five units of mung bean nuclease were added, and the mixtures were maintained at 37°C for 30
minutes. After the reaction, five DNA fragments differing from each other in the degree of deletion were recovered for
each plasmid by agarose gel electrophoresis. The DNA fragments thus recovered were blunt-ended with the Klenow
fragment and subjected to the ligation reaction at 16°C overnight, and Escherichia coli JM109 was transformed.
Single-stranded DNA was prepared from each of the various clones thus obtained with a helper phage M13K07 and subjected
to the sequence reaction with a fluorescent primer cycle-sequence kit available from Applied Biosystems (K.K.), and the
DNA sequence was determined with an automatic sequencer.
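
The sampling arithmetic of the ExoIII time course can be sketched as follows. This is only an illustration: it assumes that the whole 100 µl reaction is consumed in 10 µl portions and that the first portion is taken after one minute, neither of which is stated explicitly; the names used are hypothetical.

```python
# Sampling arithmetic for the ExoIII time course (volumes taken from the text).
REACTION_VOLUME_UL = 100   # DNA dissolved in 100 µl of ExoIII buffer
ALIQUOT_UL = 10            # one 10 µl portion per sampling
INTERVAL_MIN = 1           # one portion every minute
ALIQUOTS_PER_TUBE = 2      # two samples are pooled per tube
MB_BUFFER_UL = 20          # MB buffer already present in each tube


def sampling_schedule():
    """Return (minute, tube number) for every aliquot.

    Assumption: the full reaction volume is sampled and the first
    aliquot is taken at 1 minute.
    """
    n_aliquots = REACTION_VOLUME_UL // ALIQUOT_UL
    return [((i + 1) * INTERVAL_MIN, i // ALIQUOTS_PER_TUBE + 1)
            for i in range(n_aliquots)]


schedule = sampling_schedule()
n_tubes = max(tube for _, tube in schedule)
tube_volume_ul = MB_BUFFER_UL + ALIQUOTS_PER_TUBE * ALIQUOT_UL

for minute, tube in schedule:
    print(f"{minute:2d} min -> tube {tube}")
print(f"{n_tubes} tubes, {tube_volume_ul} µl each before heat inactivation")
```

Under these assumptions the schedule yields ten portions pooled into five tubes, which agrees with the five tubes mentioned in the procedure.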

The DNA sequence comprising 2886 base pairs (bp) thus obtained is illustrated in Figs. 5 - 9 (SEQ ID NO: 4). As
a result of examining open reading frames having a ribosome binding site in front of the initiation codon, three open
reading frames which can encode the corresponding proteins (A - B (nucleotide positions 229 - 864 of SEQ ID NO: 4),
C - D (nucleotide positions 864 - 1349), E - F (nucleotide positions 1349 - 2506) in Figs. 5 - 9) were found at the
positions where the three xanthophyll synthesis genes crtW, crtZ and crtY are expected to be present. For the two open
reading frames A - B and E - F, the initiation codon is GTG, and for the remaining open reading frame C - D, it is ATG.
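
As a quick consistency check, the lengths of the three open reading frames can be computed from the nucleotide positions given above. The sketch below uses only those coordinates; whether the end positions include a stop codon is not stated in the text, so the codon counts are raw length divided by three and nothing more.

```python
# Open reading frame coordinates (1-based, inclusive) as quoted for SEQ ID NO: 4.
SEQ_LENGTH = 2886
ORFS = {"A-B": (229, 864), "C-D": (864, 1349), "E-F": (1349, 2506)}


def orf_summary(start, end):
    """Length in bp, number of triplets, and whether the length is a multiple of 3."""
    length = end - start + 1
    return {"length_bp": length, "codons": length // 3, "in_frame": length % 3 == 0}


summary = {name: orf_summary(*coords) for name, coords in ORFS.items()}

for name, info in summary.items():
    print(name, info)

# Adjacent frames share their boundary nucleotide (864 and 1349).
shared = [(a, b) for a, b in (("A-B", "C-D"), ("C-D", "E-F"))
          if ORFS[a][1] == ORFS[b][0]]
print("frames sharing a boundary position:", shared)
```

All three lengths are multiples of three, and each frame begins at the last position of the preceding one, so consecutive frames overlap by a single nucleotide according to the quoted coordinates.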

Example 6: Identification of the orange pigment 

The lycopene-producing Escherichia coli JM101 having pAK92 or pAK96 introduced thereinto (Escherichia coli
(pACCRT-EIB, pAK92 or pAK96); exhibiting orange) or the β-carotene-producing Escherichia coli JM101 having pAK94
or pAK96K (Fig. 12) introduced thereinto (Escherichia coli (pACCAR16ΔcrtX, pAK94 or pAK96K); exhibiting orange)
was cultured in 4 liters of a 2YT culture medium (1.6% tryptone, 1% yeast extract, 0.5% NaCl) containing 150 µg/ml of
ampicillin (Ap, manufactured by Meiji Seika) and 30 µg/ml of chloramphenicol (Cm, manufactured by Sankyo) at 37°C
for 18 hours. Bacterial cells collected from the culture solution were extracted with 600 ml of acetone, concentrated,
extracted twice with 400 ml of chloroform/methanol (9/1), and concentrated to dryness. Then, thin layer chromatography
(TLC) was conducted by dissolving the residue in a small amount of chloroform/methanol (9/1) and developing on
a silica gel plate for preparative TLC manufactured by Merck with chloroform/methanol (15/1). The original orange
pigment was separated into three spots at the Rf values of 0.72, 0.82 and 0.91 by TLC. The pigment of the darkest spot at
Rf 0.72, corresponding to 50% of the total amount of orange pigment, and the pigment of the second darkest spot at Rf 0.82
were scraped off from the TLC plate, dissolved in a small amount of chloroform/methanol (9/1) or methanol, and
chromatographed on a Sephadex LH-20 column (15 x 300 mm) with an eluent of chloroform/methanol (9/1) or methanol to
give purified materials in a yield of 3 mg (Rf 0.72) and 2 mg (Rf 0.82), respectively.

It has been elucidated from the results of the UV-visible, 1H-NMR and FD-MS (m/e 596) spectra that the pigment
at Rf 0.72 has the same planar structure as that of astaxanthin. When the pigment was dissolved in diethyl ether :
2-propanol : ethanol (5 : 5 : 2) to measure the CD spectrum, it was proved to have the stereochemical configuration
3S, 3'S, and was thus identified as astaxanthin (see Fig. 11 for the structural formula). Also, the pigment at Rf 0.82 was
identified as phoenicoxanthin (see Fig. 11 for the structural formula) from the results of its UV-visible, 1H-NMR and
FD-MS (m/e 580) spectra. In addition, the pigment at Rf 0.91 was canthaxanthin (Example 7 (2)).

Example 7: Identification of metabolic intermediates of xanthophyll

(1) Identification of 4-ketozeaxanthin 

The zeaxanthin-producing Escherichia coli was prepared according to the following procedure. That is to say, the
plasmid pCAR25 having the total carotenoid synthesis gene cluster of Erwinia uredovora (Misawa, N., Nakagawa, M.,
Kobayashi, K., Yamano, S., Izawa, Y., Nakamura, K., Harashima, K., "Elucidation of the Erwinia uredovora Carotenoid
Biosynthetic Pathway by Functional Analysis of Gene Products Expressed in Escherichia coli", J. Bacteriol., 172,
p. 6704-6712, 1990; and Japanese Patent Application No. 58786/1991 (Japanese Patent Application No. 53255/1990): "DNA
Strands useful for the Synthesis of Carotenoids") was digested with restriction enzyme BstEII and subjected to the
Klenow fragment treatment and ligation reaction to inactivate the crtX gene by a reading frame shift, and then a 6.5 kb
Asp718 (KpnI) - EcoRI fragment containing the crtE, crtY, crtI, crtB and crtZ genes required for producing zeaxanthin
was cut out. This fragment was then inserted into the EcoRV site of the Escherichia coli vector pACYC184 to give the
desired plasmid (referred to hereinafter as pACCAR25ΔcrtX).

The zeaxanthin-producing Escherichia coli JM101 having pAK910 or pAK916 (Fig. 12) introduced thereinto
(Escherichia coli (pACCAR25ΔcrtX, pAK910 or pAK916); exhibiting orange) was cultured in 2 liters of a 2YT culture
medium containing 150 µg/ml of Ap and 30 µg/ml of Cm at 37°C for 18 hours. Bacterial cells collected from the culture
solution were extracted with 300 ml of acetone, concentrated, extracted twice with 200 ml of chloroform/methanol (9/1),
and concentrated to dryness. Then, thin layer chromatography (TLC) was conducted by dissolving the residue in a
small amount of chloroform/methanol (9/1) and developing on a silica gel plate for preparative TLC manufactured by
Merck with chloroform/methanol (15/1). The original orange pigment was separated into three spots at the Rf values of
0.54 (46%), 0.72 (53%) and 0.91 (1%) by TLC. The pigment at Rf 0.54 was scraped off from the TLC plate, dissolved
in a small amount of chloroform/methanol (9/1) or methanol, and chromatographed on a Sephadex LH-20 column (15
x 300 mm) with an eluent of chloroform/methanol (9/1) or methanol to give a purified material in a yield of 1.5 mg.

This material was identified as 4-ketozeaxanthin (see Fig. 11 for the structural formula) since its UV-visible
spectrum, FD-MS spectrum (m/e 582) and mobility in silica gel TLC (developed with chloroform/methanol (15/1)) accorded
perfectly with those of the standard sample of 4-ketozeaxanthin (purified from Agrobacterium aurantiacus sp. nov. MK1;
Japanese Patent Application No. 70335/1993). In addition, the pigments at Rf 0.72 and 0.91 are astaxanthin (Example
6) and canthaxanthin (Example 7 (2)), respectively.
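
The Rf assignments reported in Examples 6 and 7 (1) for development with chloroform/methanol (15/1) can be collected into a small lookup, sketched below. The tolerance of 0.03 and the function name are illustrative assumptions, not values from the text; Rf values depend on the plate and conditions, so a match of this kind is only a first indication and does not replace the spectral identification described above.

```python
# Rf values on silica gel TLC developed with chloroform/methanol (15/1),
# as reported in Examples 6 and 7 (1).
RF_TABLE = {
    0.54: "4-ketozeaxanthin",
    0.72: "astaxanthin",
    0.82: "phoenicoxanthin",
    0.91: "canthaxanthin",
}


def assign_pigment(rf, tolerance=0.03):
    """Return the pigment whose reported Rf is nearest to rf.

    The tolerance is a hypothetical cut-off; None is returned when no
    reported Rf lies within it.
    """
    nearest = min(RF_TABLE, key=lambda ref: abs(ref - rf))
    if abs(nearest - rf) <= tolerance:
        return RF_TABLE[nearest]
    return None


for observed in (0.72, 0.90, 0.30):
    print(observed, "->", assign_pigment(observed))
```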

(2) Identification of canthaxanthin

The β-carotene-producing Escherichia coli JM101 having pAK910 or pAK916 introduced thereinto (Escherichia coli
(pACCAR16ΔcrtX, pAK910 or pAK916); exhibiting orange) was cultured in 2 liters of a 2YT culture medium containing
150 µg/ml of Ap and 30 µg/ml of Cm at 37°C for 18 hours. Bacterial cells collected from the culture solution were
extracted with 300 ml of acetone, concentrated, extracted twice with 200 ml of chloroform/methanol (9/1), and
concentrated to dryness. Then, thin layer chromatography (TLC) was conducted by dissolving the residue in a small amount
of chloroform/methanol (9/1) and developing on a silica gel plate for preparative TLC manufactured by Merck with
chloroform/methanol (50/1). The pigment of the darkest spot, corresponding to 94% of the total amount of orange pigments,
was scraped off from the TLC plate, dissolved in a small amount of chloroform/methanol (9/1) or chloroform/methanol
(1/1), and chromatographed on a Sephadex LH-20 column (15 x 300 mm) with an eluent of chloroform/methanol (9/1)
or chloroform/methanol (1/1) to give a purified material in a yield of 3 mg.

This material was identified as canthaxanthin (see Fig. 11 for the structural formula) since its UV-visible, 1H-NMR and
FD-MS (m/e 564) spectra and mobility in silica gel TLC (Rf 0.53 on developing with chloroform/methanol (50/1))
accorded perfectly with those of the standard sample of canthaxanthin (manufactured by BASF). In addition, the
pigment corresponding to 6% of the total orange pigments found in the initial extract was considered to be echinenone
(see Fig. 11 for the structural formula) on the basis of its UV-visible spectrum, mobility in silica gel TLC (Rf 0.78 on
developing with chloroform/methanol (50/1)), and mobility in HPLC with NOVA PACK HR 6µ C18 (3.9 x 300 mm;
manufactured by Waters) (RT 16 minutes on developing at a flow rate of 1.0 ml/min with
acetonitrile/methanol/2-propanol (90/6/4)).

(3) Identification of zeaxanthin 

The β-carotene-producing Escherichia coli JM101 having pAK96NK introduced thereinto (Escherichia coli
(pACCAR16ΔcrtX, pAK96NK); exhibiting yellow) was cultured in 2 liters of a 2YT culture medium containing 150 µg/ml
of Ap and 30 µg/ml of Cm at 37°C for 18 hours. Bacterial cells collected from the culture solution were extracted with
300 ml of acetone, concentrated, extracted twice with 200 ml of chloroform/methanol (9/1), and concentrated to dryness.
Then, thin layer chromatography (TLC) was conducted by dissolving the residue in a small amount of
chloroform/methanol (9/1) and developing on a silica gel plate for preparative TLC manufactured by Merck with
chloroform/methanol (9/1). The pigment of the darkest spot, corresponding to 87% of the total amount of yellow pigments,
was scraped off from the TLC plate, dissolved in a small amount of chloroform/methanol (9/1) or methanol, and
chromatographed on a Sephadex LH-20 column (15 x 300 mm) with an eluent of chloroform/methanol (9/1) or methanol
to give a purified material in a yield of 3 mg.

It has been elucidated that this material has the same planar structure as that of zeaxanthin since its UV-visible,
1H-NMR and FD-MS (m/e 568) spectra and mobility in silica gel TLC (Rf 0.59 on developing with chloroform/methanol
(9/1)) accorded perfectly with those of the standard sample of zeaxanthin (manufactured by BASF). When the pigment
was dissolved in diethyl ether : 2-propanol : ethanol (5 : 5 : 2) to measure the CD spectrum, it was proved to have the
stereochemical configuration 3R, 3'R, and was thus identified as zeaxanthin (see Fig. 11 for the structural formula).
Also, the pigment corresponding to 13% of the total yellow pigments found in the initial extract was considered to be
β-cryptoxanthin (see Fig. 11 for the structural formula) on the basis of its UV-visible spectrum, mobility in silica gel
TLC (Rf 0.80 on developing with chloroform/methanol (9/1)), and mobility in HPLC with NOVA PACK HR 6µ C18
(3.9 x 300 mm; manufactured by Waters) (RT 19 minutes on developing at a flow rate of 1.0 ml/min with
acetonitrile/methanol/2-propanol (90/6/4)).

(4) Identification of β-carotene

The lycopene-producing Escherichia coli JM101 having pAK98 introduced thereinto (Escherichia coli (pACCRT-EIB,
pAK98); exhibiting yellow) was cultured in 2 liters of a 2YT culture medium containing 150 µg/ml of Ap and 30 µg/ml
of Cm at 37°C for 18 hours. Bacterial cells collected from the culture solution were extracted with 300 ml of acetone,
concentrated, and extracted twice with 200 ml of hexane. The hexane layer was concentrated and chromatographed on a
silica gel column (15 x 300 mm) with an eluent of hexane/ethyl acetate (50/1) to give 3 mg of a purified material.

The material was identified as β-carotene (see Fig. 11 for the structural formula), since all of the data of its
UV-visible spectrum, FD-MS spectrum (m/e 536) and mobility in HPLC with NOVA PACK HR 6µ C18 (3.9 x 300 mm;
manufactured by Waters) (RT 62 minutes on developing at a flow rate of 1.0 ml/min with
acetonitrile/methanol/2-propanol (90/6/4)) accorded with those of the standard sample of β-carotene (all-trans type;
manufactured by Sigma).
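
The FD-MS values quoted in Examples 6 and 7 can be cross-checked against nominal molecular masses. The sketch below is an illustration only: the molecular formulas are the commonly cited literature formulas for these carotenoids and are not given in the text, and the helper name is hypothetical.

```python
import re

# Nominal (integer) atomic masses for the elements that occur in these pigments.
NOMINAL_MASS = {"C": 12, "H": 1, "O": 16}

# Assumption: standard literature formulas, not stated in the text.
FORMULAS = {
    "astaxanthin": "C40H52O4",
    "phoenicoxanthin": "C40H52O3",
    "4-ketozeaxanthin": "C40H54O3",
    "canthaxanthin": "C40H52O2",
    "zeaxanthin": "C40H56O2",
    "beta-carotene": "C40H56",
}


def nominal_mass(formula):
    """Sum nominal atomic masses over a simple formula such as 'C40H56O2'."""
    total = 0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += NOMINAL_MASS[element] * (int(count) if count else 1)
    return total


masses = {name: nominal_mass(formula) for name, formula in FORMULAS.items()}
for name, mass in masses.items():
    print(f"{name}: {mass}")
```

Each computed value can be set against the m/e figure reported for the corresponding pigment; each keto group added to β-carotene raises the nominal mass by 14 and each hydroxyl group by 16.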

Example 8: Identification of xanthophyll synthesis gene cluster 

(1) Identification of a keto group-introducing enzyme gene

It is apparent from the results of Example 6 that, of the 3.9 kb fragment contained in pAK9 (Example 4) or
pAK92, all of the genes required for the synthesis of astaxanthin from lycopene are contained in the 2.9 kb BamHI
fragment at the right side (pAK96, Fig. 12). Thus, the 1.0 kb fragment at the left side is not needed. Unique NcoI and
KpnI sites are present within the 2.9 kb BamHI fragment of pAK96. It is found from the results of Example 7 (3) that the
1.4 kb fragment (pAK96NK) between the NcoI and KpnI sites has a hydroxyl group-introducing enzyme activity but has no
keto group-introducing enzyme activity. Canthaxanthin can also be synthesized from β-carotene with the 2.9 kb BamHI
fragment from which the fragment to the right of the unique SalI site between the NcoI and KpnI sites had been
removed (pAK910), or with the 2.9 kb BamHI fragment from which the fragment to the right of the HincII site
positioned at the left side of the SalI site had been removed (pAK916), but the activity for synthesizing canthaxanthin
from β-carotene disappeared in the 2.9 kb BamHI fragment of pAK96 from which the fragment to the right of the NcoI site
left of the HincII site had been removed. On the other hand, even when the fragment to the left of the unique BglII site,
which is present leftward within the 0.9 kb BamHI - HincII fragment of pAK916, was removed, activity similar to that of
the aforementioned BamHI - HincII fragment (pAK916) was observed. It is thus considered that a gene encoding a keto
group-introducing enzyme having an enzyme activity for synthesizing canthaxanthin from β-carotene as a substrate is
present within the 0.74 kb BglII - HincII fragment of pAK916, and that the aforementioned NcoI site is present within
this gene. As a result of determining the nucleotide sequence, an open reading frame which corresponds to the gene and
has a ribosome binding site just in front of the initiation codon was successfully detected; it was then designated as
the crtW gene. The nucleotide sequence of the crtW gene and the encoded amino acid sequence are illustrated in Fig. 1
(SEQ ID NO: 1).

The crtW gene product (CrtW) of Agrobacterium aurantiacus sp. nov. MK1 has an enzyme activity for converting a
methylene group at the 4-position of a β-ionone ring into a keto group, and one of the specific examples is an enzyme
activity for synthesizing canthaxanthin from β-carotene as a substrate by way of echinenone (Example 7 (2); see Fig.
11). Furthermore, the crtW gene product also has an enzyme activity for converting a methylene group at the 4-position
of a 3-hydroxy-β-ionone ring into a keto group, and one of the specific examples is an enzyme activity for synthesizing
astaxanthin from zeaxanthin as a substrate by way of 4-ketozeaxanthin (Example 7 (1); see Fig. 11). In addition,
polypeptides having such enzyme activities and DNA strands encoding these polypeptides have not hitherto been
known, and the polypeptides and the DNA strands encoding them have no overall homology to any
polypeptides or DNA strands hitherto known. Also, it has not hitherto been described that a methylene group, whether
of a β-ionone ring, of a 3-hydroxy-β-ionone ring or of any other compound, is directly converted into a keto group by
an enzyme.

(2) Identification of a hydroxyl group-introducing enzyme gene 

A unique SalI site is present within the 2.9 kb BamHI fragment of pAK96. When the 2.9 kb BamHI fragment is cut into
two fragments at the SalI site, these two fragments (pAK910 and pAK98) have no hydroxyl group-introducing activity.
That is to say, the left fragment (pAK910) has only a keto group-introducing enzyme activity (Example 7 (2)), and the
right fragment (pAK98) has only a lycopene-cyclizing enzyme activity (Example 7 (4)). On the other hand, when a 1.4
kb NcoI - KpnI fragment (pAK96NK) containing the aforementioned SalI site is introduced into a β-carotene-producing
Escherichia coli, zeaxanthin is synthesized by way of β-cryptoxanthin (Example 7 (3)). It is thus considered that a gene
encoding a hydroxyl group-introducing enzyme which has an enzyme activity for synthesizing zeaxanthin from
β-carotene as a substrate is present within the 1.4 kb NcoI - KpnI fragment of pAK96NK, and that the aforementioned SalI
site is present within this gene. As a result of determining the nucleotide sequence, an open reading frame which
corresponds to the gene and has a ribosome binding site just in front of the initiation codon was successfully detected;
it was then referred to as the crtZ gene. The nucleotide sequence of the crtZ gene and the encoded amino acid sequence
are illustrated in Fig. 2 (SEQ ID NO: 2).

The crtZ gene product (CrtZ) of Agrobacterium aurantiacus sp. nov. MK1 has an enzyme activity for adding a
hydroxyl group to the 3-carbon of a β-ionone ring, and one of the specific examples is an enzyme activity for
synthesizing zeaxanthin from β-carotene as a substrate by way of β-cryptoxanthin (Example 7 (3); see Fig. 11).
Furthermore, the crtZ gene product also has an enzyme activity for adding a hydroxyl group to the 3-carbon of a
4-keto-β-ionone ring, and one of the specific examples is an enzyme activity for synthesizing astaxanthin from
canthaxanthin as a substrate by way of phoenicoxanthin (Example 6; see Fig. 11). In addition, polypeptides having the
latter enzyme activity and DNA strands encoding these polypeptides have not hitherto been known. Also, the CrtZ of
Agrobacterium showed significant homology to the CrtZ of Erwinia uredovora (identity of 57%) at the level of amino acid
sequence.
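
A percent identity figure of this kind is obtained by counting identical residues over an alignment of the two amino acid sequences. The text does not state how the alignment was made or which denominator was used, so the following is only a minimal sketch under one common convention (identical positions divided by aligned positions where neither sequence has a gap); the two short sequences are invented for illustration and are not CrtZ sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences of equal length.

    Assumption: columns containing a gap in either sequence are excluded
    from the denominator. Other conventions exist and give other values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no aligned positions to compare")
    return 100.0 * identical / compared


# Toy alignment (hypothetical sequences): 4 identical residues in 5 compared columns.
toy_identity = percent_identity("MKT-LV", "MRTALV")
print(f"{toy_identity:.1f}% identity")
```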

(3) Identification of a lycopene cyclase gene 

Astaxanthin can be synthesized from β-carotene with the 2.9 kb BamHI fragment from which the fragment to the right
of the KpnI site had been removed (pAK96K) or with the 2.9 kb BamHI fragment from which the fragment to the right of
the PstI site, which is placed further right of the KpnI site, had been removed (pAK94) (Example 6), but astaxanthin
cannot be synthesized from lycopene. On the other hand, when a 1.6 kb SalI fragment (pAK98), which contains the fragment
to the right of the unique SalI site present further left than the aforementioned KpnI site within the 2.9 kb BamHI
fragment, was introduced into lycopene-producing Escherichia coli, β-carotene was synthesized (Example 7 (4)). It is
thus considered that a gene encoding lycopene cyclase, which has an enzyme activity for synthesizing β-carotene from
lycopene as a substrate, is present within the 1.6 kb SalI fragment of pAK98, and that this gene spans the KpnI site
and the PstI site. As a result of determining the nucleotide sequence, an open reading frame which corresponds to the
gene and has a ribosome binding site just in front of the initiation codon was successfully detected; it was then
referred to as the crtY gene. The nucleotide sequence of the crtY gene and the encoded amino acid sequence are
illustrated in Figs. 3 - 4 (SEQ ID NO: 3).

The crtY gene product (CrtY) of Agrobacterium aurantiacus sp. nov. MK1 has significant homology to the CrtY of
Erwinia uredovora (identity of 44.3%) at the level of amino acid sequence, and the functions of both enzymes are the
same.
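
The deletion analysis of Example 8 amounts to asking which end product is expected when a given set of enzyme activities is added to a host that already accumulates a given carotenoid. The sketch below encodes that reasoning in a simplified form. The activity sets assigned to each plasmid are read from Examples 6 to 8; the function and variable names are hypothetical, and the model reports only the final product of complete conversion, ignoring intermediates such as echinenone or 4-ketozeaxanthin that were actually observed.

```python
# Activities carried by each plasmid, as inferred in Examples 6 to 8.
PLASMID_ACTIVITIES = {
    "pAK96": {"keto", "hydroxyl", "cyclase"},
    "pAK96K": {"keto", "hydroxyl"},
    "pAK94": {"keto", "hydroxyl"},
    "pAK910": {"keto"},
    "pAK916": {"keto"},
    "pAK96NK": {"hydroxyl"},
    "pAK98": {"cyclase"},
}

# Host pigment -> (rings cyclized, 4-keto present, 3-hydroxyl present).
HOST_STATE = {
    "lycopene": (False, False, False),
    "beta-carotene": (True, False, False),
    "zeaxanthin": (True, False, True),
}

# End products named by (keto, hydroxyl) once the rings are cyclized.
CYCLIC_PRODUCT = {
    (False, False): "beta-carotene",
    (True, False): "canthaxanthin",
    (False, True): "zeaxanthin",
    (True, True): "astaxanthin",
}


def main_product(host_pigment, activities):
    """Expected end product, assuming every available step goes to completion."""
    cyclized, keto, hydroxyl = HOST_STATE[host_pigment]
    cyclized = cyclized or "cyclase" in activities
    if not cyclized:
        return "lycopene"  # ring modifications need a cyclized substrate
    keto = keto or "keto" in activities
    hydroxyl = hydroxyl or "hydroxyl" in activities
    return CYCLIC_PRODUCT[(keto, hydroxyl)]


for host, plasmid in [("lycopene", "pAK96"), ("beta-carotene", "pAK910"),
                      ("beta-carotene", "pAK96NK"), ("lycopene", "pAK98"),
                      ("lycopene", "pAK96K")]:
    print(host, "+", plasmid, "->", main_product(host, PLASMID_ACTIVITIES[plasmid]))
```

The predictions agree with the observations reported above, for example that pAK96K and pAK94 give astaxanthin from β-carotene but not from lycopene.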

Example 9: Southern blotting analysis with the chromosomal DNA of the other marine bacteria

It was examined whether a region exhibiting homology with the isolated crtW and crtZ genes could be obtained from
the chromosomal DNAs of the other marine microorganisms. The chromosomal DNAs of Alcaligenes sp. PC-1 and
Alteromonas sp. SD-402 prepared in Example 1 were digested with restriction enzymes BamHI and PstI, and separated
by agarose gel electrophoresis. All of the DNA fragments thus separated were denatured with an alkali solution
of 0.5 N NaOH and 1.5 M NaCl, and transferred onto a nylon membrane filter overnight. The nylon membrane
filter on which the DNAs had been adsorbed was dipped in a hybridization solution (6 x Denhardt, 5 x SSC, 100 µg/ml
ssDNA), and pre-hybridization was conducted at 60°C for 2 hours. Next, the 1.5 kb DNA fragment cut out from pAK96K
with BalI, which contains crtW and crtZ, was labelled with a Megaprime DNA labelling system (Amersham) and
[α-32P]dCTP (~110 TBq/mmol) and added to the aforementioned prehybridization solution to conduct hybridization at 60°C
for 16 hours.

After hybridization, the filter was washed with 2 x SSC containing 0.1% SDS at 60°C for 1 hour, and subjected to
the detection of signals showing homology by autoradiography. As a result, strong signals were obtained at about 13 kb
in the product digested with BamHI and at 2.35 kb in the product digested with PstI in the case of Alcaligenes sp.
PC-1, and strong signals were obtained at about 5.6 kb in the product digested with BamHI and at 20 kb or more in the
product digested with PstI in the case of Alteromonas sp. SD-402.

Example 10: Acquisition of a xanthophyll synthesis gene cluster from the other marine bacterium

As it was found from the results of Example 9 that the PstI digest of the chromosomal DNA of Alcaligenes sp.
PC-1 has a region of about 2.35 kb hybridizing with a DNA fragment containing the crtW and crtZ genes of Agrobacterium
aurantiacus sp. nov. MK1, the chromosomal DNA of Alcaligenes was digested with PstI, and then DNA fragments of 2
- 3.5 kb in size were recovered by agarose gel electrophoresis. The DNA fragments thus collected were inserted into the
PstI site of a vector pBluescript II SK+ and introduced into Escherichia coli DH5α to prepare a partial library of
Alcaligenes. When the partial library was subjected to colony hybridization with a 1.5 kb DNA fragment containing the
crtW and crtZ genes of Agrobacterium as a probe, a positive colony was isolated from about 5,000 colonies. In this case,
colony hybridization was conducted under the same conditions as in the Southern blotting analysis shown in Example 9.
When plasmid DNA was isolated from the colony thus obtained and digested with PstI to examine the size of the
integrated DNA fragments, it was found that the plasmid contained three different fragments. Thus, the 2.35 kb
fragment that hybridizes was selected from the three different DNA fragments by the Southern blotting analysis
described in Example 9, and the 2.35 kb PstI fragment was recovered by agarose gel electrophoresis and inserted again
into the PstI site of pBluescript II SK+ to prepare the plasmids pPC11 and pPC12. In pPC11 and pPC12, the
aforementioned 2.35 kb PstI fragment was inserted into the PstI site of pBluescript II SK+ in opposite directions.
The restriction enzyme map of pPC11 is illustrated in Fig. 19.

Example 11: Determination of nucleotide sequence of xanthophyll synthesis gene cluster in Alcaligenes

When each of pPC11 and pPC12 was introduced into β-carotene-producing Escherichia coli, orange colonies were
obtained due to the synthesis of astaxanthin (Example 12) in the former, but no other pigments were newly synthesized
in the latter. It was thus considered that the direction of the astaxanthin synthesis gene cluster in the plasmid pPC11
was the same as that of the vector lac promoter. It was also found that pPC11 contained no lycopene-cyclizing enzyme
gene, since no other pigments were newly produced even when pPC11 was introduced into the lycopene-producing
Escherichia coli.

It was found that even when a plasmid from which a 0.72 kb BstEII - EcoRV fragment positioned at the right side of the
PstI fragment had been removed (referred to as pPC17, Fig. 19) was introduced into the β-carotene-producing Escherichia
coli, the transformant synthesized astaxanthin and the like (Example 12), as in the case of E. coli
into which pPC11 was introduced, so the nucleotide sequence of the 1.63 kb PstI - BstEII fragment in pPC17
was determined.

Deletion mutants were prepared with pPC17 and pPC12 according to the following procedure. A 10 µg portion of
each of pPC17 and pPC12 was digested with KpnI and HindIII or with KpnI and EcoRI and extracted with
phenol/chloroform, and the DNA was recovered by precipitation with ethanol. Each DNA was dissolved in 100 µl of ExoIII
buffer (50 mM Tris-HCl, 100 mM NaCl, 5 mM MgCl2, 10 mM 2-mercaptoethanol, pH 8.0), 180 units of ExoIII nuclease were
added, and the mixture was maintained at 37°C. A 10 µl portion was sampled every minute, and every two samples were
transferred into a tube containing 20 µl of MB buffer (40 mM sodium acetate, 100 mM NaCl, 2 mM ZnCl2, 10% glycerol,
pH 4.5) and placed on ice. After completion of the sampling, the five tubes thus obtained were maintained at
65°C for 10 minutes to inactivate the enzyme, five units of mung bean nuclease were added, and the mixture was
maintained at 37°C for 30 minutes. After the reaction, ten DNA fragments differing from each other in the degree of
deletion were recovered for each plasmid by agarose gel electrophoresis. The DNA fragments thus recovered were
blunt-ended with the Klenow fragment and subjected to the ligation reaction at 16°C overnight, and Escherichia coli
JM109 was transformed. Single-stranded DNA was prepared from each of the various clones thus obtained with a helper
phage M13K07 and subjected to the sequence reaction with a fluorescent primer cycle-sequence kit available from
Applied Biosystems (K.K.), and the DNA sequence was determined with an automatic sequencer.

The DNA sequence comprising 1631 base pairs (bp) thus obtained is illustrated in Figs. 16 - 18 (SEQ ID NO: 7).
As a result of examining open reading frames having a ribosome binding site in front of the initiation codon, two open
reading frames which can encode the corresponding proteins (A - B (nucleotide positions 99 - 824 of SEQ ID NO: 7),
C - D (nucleotide positions 824 - 1309) in Figs. 16 - 18) were found at the positions where the two xanthophyll
synthesis genes crtW and crtZ were expected to be present.

Example 12: Identification of pigments produced by Escherichia coli having an Alcaligenes xanthophyll synthesis gene
cluster

(1) Identification of astaxanthin and 4-ketozeaxanthin 

A deletion plasmid (having only crtW) having a deletion from the right BstEII site to nucleotide position 1162 (Fig.
17) (nucleotide position 1162 of SEQ ID NO: 7), among the deletion plasmids from pPC17 prepared in Example 11, was
referred to as pPC17-3 (Fig. 19).
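
That a deletion ending at position 1162 leaves only one intact gene can be checked against the open reading frame coordinates given in Example 11. The sketch below assumes that pPC17-3 retains nucleotides 1 to 1162 of SEQ ID NO: 7 (the text does not say whether position 1162 itself is retained) and that frame A - B corresponds to crtW and frame C - D to crtZ, which is inferred from the order in which the genes are named; the helper names are hypothetical.

```python
# Open reading frames of SEQ ID NO: 7 (1-based, inclusive), from Example 11.
ORFS = {"A-B": (99, 824), "C-D": (824, 1309)}

# Assumption: pPC17-3 retains positions 1..1162 of SEQ ID NO: 7.
RETAINED_END = 1162


def is_intact(orf, retained_end):
    """True if the whole frame lies within the retained region 1..retained_end."""
    start, end = orf
    return 1 <= start and end <= retained_end


def retained_length(orf, retained_end):
    """Number of bases of the frame that survive the deletion."""
    start, end = orf
    return max(0, min(end, retained_end) - start + 1)


status = {name: is_intact(orf, RETAINED_END) for name, orf in ORFS.items()}
remaining = {name: retained_length(orf, RETAINED_END) for name, orf in ORFS.items()}

for name in ORFS:
    full = ORFS[name][1] - ORFS[name][0] + 1
    print(f"{name}: intact={status[name]}, {remaining[name]} of {full} bp retained")
```

Under these assumptions frame A - B is fully retained while frame C - D loses its 3' portion, which is consistent with the description of pPC17-3 as having only crtW.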

The zeaxanthin-producing Escherichia coli JM101 (Example 7 (1)) having pPC17-3 introduced thereinto
(Escherichia coli (pACCAR25ΔcrtX, pPC17-3); exhibiting orange) was cultured in 2 liters of 2YT culture medium
containing 150 µg/ml of Ap and 30 µg/ml of Cm at 37°C for 18 hours. Bacterial cells collected from the culture solution
were extracted with 300 ml of acetone, concentrated, extracted twice with 200 ml of chloroform/methanol (9/1), and
concentrated to dryness. Then, thin layer chromatography (TLC) was conducted by dissolving the residue in a small amount
of chloroform/methanol (9/1) and developing on a silica gel plate for preparative TLC manufactured by Merck with
chloroform/methanol (15/1). The original orange pigment was separated into three spots at the Rf values of 0.54 (ca. 25%),
0.72 (ca. 30%) and 0.91 (ca. 25%). The pigments at the Rf values of 0.54 and 0.72 were scraped off from the TLC
plate, dissolved in a small amount of chloroform/methanol (9/1) or methanol, and chromatographed on a Sephadex
LH-20 column (15 x 300 mm) with an eluent of chloroform/methanol (9/1) or methanol to give purified materials in a yield
of about 1 mg each.

The materials were identified as 4-ketozeaxanthin (Rf 0.54) and astaxanthin (Rf 0.72), since all of the data of their
UV-visible and FD-MS spectra and mobility in TLC (developed with chloroform/methanol (15/1)) accorded with those of the
standard samples of 4-ketozeaxanthin and astaxanthin. In addition, the pigment at the Rf value of 0.91 was
canthaxanthin (Example 12 (2)).

It was also confirmed by similar analytical procedures that the β-carotene-producing Escherichia coli JM101
having pPC11 or pPC17 introduced thereinto (Escherichia coli (pACCAR16ΔcrtX, pPC11 or pPC17); exhibiting orange)
produces astaxanthin, 4-ketozeaxanthin and canthaxanthin. Furthermore, it was also confirmed with the authentic
sample of phoenicoxanthin obtained in Example 6 that these E. coli transformants produce a trace amount of
phoenicoxanthin.

(2) Identification of canthaxanthin

The β-carotene-producing Escherichia coli JM101 having pPC17-3 introduced thereinto (Escherichia coli
(pACCAR16ΔcrtX, pPC17-3); exhibiting orange) was cultured in 2 liters of 2YT culture medium containing 150 µg/ml of
Ap and 30 µg/ml of Cm at 37°C for 18 hours. Bacterial cells collected from the culture solution were extracted with 300
ml of acetone, concentrated, extracted twice with 200 ml of chloroform/methanol (9/1), and concentrated to dryness.
Then, thin layer chromatography (TLC) was conducted by dissolving the residue in a small amount of
chloroform/methanol (9/1) and developing on a silica gel plate for preparative TLC manufactured by Merck with
chloroform/methanol (50/1). The darkest pigment, corresponding to 40% of the total amount of orange pigments, was scraped
off from the TLC plate, dissolved in a small amount of chloroform/methanol (9/1) or chloroform/methanol (1/1), and
chromatographed on a Sephadex LH-20 column (15 x 300 mm) with an eluent of chloroform/methanol (9/1) or
chloroform/methanol (1/1) to give a purified material in a yield of 2 mg.

The material was identified as canthaxanthin, since all of the data of its UV-visible and FD-MS (m/e 564) spectra and
mobility in TLC (developed with chloroform/methanol (50/1)) accorded with those of the standard sample of
canthaxanthin (manufactured by BASF). In addition, the pigment amounting to 50% of the total amount of the
orange pigments observed in the initial extract was considered to be echinenone from its UV-visible spectrum, mobility
in silica gel TLC (developed with chloroform/methanol (50/1)), and mobility in HPLC with NOVA PACK HR 6µ C18 (3.9
x 300 mm; manufactured by Waters) (developed with acetonitrile/methanol/2-propanol (90/6/4)) (Example 7 (2)). In
addition, the balance of the extracted pigments, 10%, was unreacted β-carotene.

(3) Identification of zeaxanthin

A plasmid in which the 1.15 kb SalI fragment of pPC11 was inserted into the SalI site of pBluescript II SK+ in the
same direction as in the plasmid pPC11 was prepared (referred to as pPC13, see Fig. 19).

The β-carotene-producing Escherichia coli JM101 having pPC13 introduced thereinto (Escherichia coli
(pACCAR16ΔcrtX, pPC13); exhibiting yellow) was cultured in 2 liters of 2YT culture medium containing 150 µg/ml of Ap
and 30 µg/ml of Cm at 37°C for 18 hours. Bacterial cells collected from the culture solution were extracted with 300 ml
of acetone, concentrated, extracted twice with 200 ml of chloroform/methanol (9/1), and concentrated to dryness. Then,
thin layer chromatography (TLC) was conducted by dissolving the residue in a small amount of chloroform/methanol
(9/1) and developing on a silica gel plate for preparative TLC manufactured by Merck with chloroform/methanol (9/1).
The darkest pigment, corresponding to 90% of the total amount of orange pigments, was scraped off from the TLC
plate, dissolved in a small amount of chloroform/methanol (9/1) or methanol, and chromatographed on a Sephadex
LH-20 column (15 x 300 mm) with an eluent of chloroform/methanol (9/1) or methanol to give a purified material in a
yield of 3 mg.

The material was identified as zeaxanthin, since all of the data of its UV-visible and FD-MS (m/e 568) spectra and
its mobility in TLC (developed with chloroform/methanol (9/1)) accorded with those of the standard sample of zeaxanthin
(Example 7 (3)). In addition, the pigment whose amount corresponds to 10% of the total amount of the orange pigments
observed in the initial extract was considered to be β-cryptoxanthin from its UV-visible spectrum, mobility in silica
gel TLC (developed with chloroform/methanol (9/1)), and mobility in HPLC with NOVA PACK HR 6µ C18 (3.9 x 300 mm;
manufactured by Waters) (developed with acetonitrile/methanol/2-propanol (90/6/4)) (Example 7 (3)).
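
The FD-MS values quoted in this example (m/e 564 and m/e 568) can be cross-checked against the nominal masses of the two pigments. The sketch below is only an illustration: the molecular formulas (C40H52O2 for canthaxanthin, C40H56O2 for zeaxanthin) are the commonly cited ones and are not stated in this text, and the helper name `nominal_mass` is hypothetical.

```python
# Nominal (integer) masses of the most abundant isotopes.
NOMINAL_MASS = {"C": 12, "H": 1, "O": 16}


def nominal_mass(formula):
    """Sum nominal masses for a formula given as {element: atom count}."""
    return sum(NOMINAL_MASS[element] * count for element, count in formula.items())


# Assumed formulas (not given in the text above).
canthaxanthin = {"C": 40, "H": 52, "O": 2}
zeaxanthin = {"C": 40, "H": 56, "O": 2}

print(nominal_mass(canthaxanthin))  # 564
print(nominal_mass(zeaxanthin))     # 568
```

Under these assumptions the two results agree with the m/e values reported for the purified materials.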

Example 13: Identification of the Alcaligenes xanthophyll synthesis gene cluster

(1) Identification of a keto group-introducing enzyme gene

It is apparent from the results of Examples 11 and 12 (1) that all of the genes required for the synthesis of
astaxanthin from β-carotene within the 2.35 kb PstI fragment contained in pPC11 are contained in the 1.63 kb PstI -
BstEII fragment (pPC17, Fig. 19) on the left side. Thus, the 0.72 kb BstEII - PstI fragment on the right side is not
needed. Unique SmaI and SalI sites are present within the 1.63 kb PstI - BstEII fragment of pPC17 (Fig. 19). It was
confirmed by pigment analysis with a β-carotene-producing Escherichia coli having the deletion plasmids introduced
thereinto that the keto group-introducing enzyme activity was lost when the 0.65 kb and 0.69 kb fragments on the left
side of the SmaI and SalI sites, respectively, were removed. It was also confirmed by pigment analysis with a
β-carotene-producing Escherichia coli having the plasmid introduced thereinto that the plasmid having the 0.69 kb
PstI - SalI fragment positioned on the left side of the 1.63 kb PstI - BstEII fragment inserted into the PstI - SalI
site of pBluescript SK+ has no keto group-introducing enzyme activity. On the other hand, the deletion plasmid pPC17-3
(Fig. 19), in which a deletion from the BstEII end at the right end to nucleotide No. 1162 (nucleotide position 1162
in SEQ ID NO: 7) occurred, has a keto group-introducing enzyme activity (Example 12 (1), (2)), so that it is
considered that a gene encoding a keto group-introducing enzyme having an enzyme activity for synthesizing
canthaxanthin or astaxanthin with a substrate of β-carotene or zeaxanthin is present in the 1162 bp fragment in
pPC17-3, and that the aforementioned SmaI and SalI sites are present within this gene. As a result of determining the
nucleotide sequence, an open reading frame which corresponds to the gene and has a ribosome binding site just in
front of the initiation codon was successfully detected, and it was referred to as the crtW gene. The nucleotide
sequence of the crtW gene and the encoded amino acid sequence are illustrated in Figs. 13-14 (SEQ ID NO: 5).
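
The statement that the SmaI and SalI sites lie inside the crtW gene can be illustrated by searching a stretch of the coding sequence for the two recognition sequences. The following is a minimal sketch, not the procedure used by the inventors: the recognition sequences (CCCGGG for SmaI, GTCGAC for SalI) are the standard ones, the helper name `find_sites` is hypothetical, and the excerpt is transcribed from SEQ ID NO: 5 as printed in the sequence listing (nucleotides 544 to 606), so any transcription error there would carry over.

```python
# Standard recognition sequences (assumed; not spelled out in the text).
SITES = {"SmaI": "CCCGGG", "SalI": "GTCGAC"}


def find_sites(seq, sites=SITES):
    """Return {enzyme: [0-based start positions]} for every occurrence."""
    hits = {}
    for name, site in sites.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start)
            start = seq.find(site, start + 1)
        hits[name] = positions
    return hits


# Excerpt of the crtW coding strand, SEQ ID NO: 5, nucleotides 544-606 as printed.
excerpt = ("CTGCCCCACCGCCCGGGACATGACGATTTTCCC"
           "GACCGGCACAACGCGAGGTCGACCGGCATC")

print(find_sites(excerpt))  # {'SmaI': [11], 'SalI': [50]}
```

Counted from the start of the excerpt, the offsets correspond to nucleotides 555 and 594 of SEQ ID NO: 5. If the open reading frame begins at nucleotide 99 of SEQ ID NO: 7, as the listing suggests, the sites fall roughly 0.65 kb and 0.69 kb from the PstI end, in line with the fragment sizes quoted above.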

The crtW gene product (CrtW) of Alcaligenes sp. PC-1 has an enzyme activity for converting a methylene group at
the 4-position of a β-ionone ring into a keto group, and one specific example is an enzyme activity for synthesizing
canthaxanthin from β-carotene as a substrate by way of echinenone (Example 12 (2); see Fig. 11). Furthermore,
the crtW gene product also has an enzyme activity for converting a methylene group at the 4-position of a
3-hydroxy-β-ionone ring into a keto group, and one specific example is an enzyme activity for synthesizing astaxanthin
from zeaxanthin as a substrate by way of 4-ketozeaxanthin (Example 12 (1); see Fig. 11). Polypeptides having
such enzyme activities and DNA strands encoding these polypeptides have not hitherto been known, and the
polypeptides and the DNA strands encoding them have no overall homology to any polypeptides or DNA strands
hitherto known. Also, the crtW gene products (CrtW) of Agrobacterium aurantiacus sp. nov. MK1 and Alcaligenes
sp. PC-1 share high homology (identity of 83%) at the level of amino acid sequence, and the functions of both
enzymes are the same. The amino acid sequence in the 17% region having no identity between these amino acid
sequences is considered not so significant to the functions of the enzyme. It is thus considered that, particularly
in this region, a small number of substitutions by other amino acids, deletions, or additions of other amino acids
will not affect the enzyme activity.

It can be said that the keto group-introducing enzyme gene crtW of marine bacteria encodes a β-ionone or
3-hydroxy-β-ionone ring ketolase which directly converts the methylene group at the 4-position into a keto group
irrespective of whether a hydroxyl group is added to the 3-position or not. In addition, it has not hitherto been
described that a methylene group, not only of a β-ionone ring or a 3-hydroxy-β-ionone ring but also of any other
compound, is directly converted into a keto group by a single enzyme.

(2) Identification of a hydroxyl group-introducing enzyme gene 

All of the genes required for the synthesis of astaxanthin from β-carotene are contained in the 1.63 kb PstI - BstEII
fragment (Fig. 19) of pPC17. One SalI site is present within the 1.63 kb PstI - BstEII fragment of pPC17. It is apparent
from the results of Example 12 (3) that a hydroxyl group-introducing enzyme activity is present in the fragment on the
right side of the SalI site. It is thus understood that the hydroxyl group-introducing enzyme activity is present in
the 0.94 kb SalI - BstEII fragment, which is the right-hand fragment of the 1.63 kb PstI - BstEII fragment. As a result
of determining the nucleotide sequence, an open reading frame which corresponds to the gene and has a ribosome binding
site just in front of the initiation codon was successfully detected, and it was referred to as the crtZ gene. The
nucleotide sequence of the crtZ gene and the encoded amino acid sequence are illustrated in Fig. 15 (SEQ ID NO: 6).
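
Reading an open reading frame of this kind amounts to decoding the sequence codon by codon until a stop codon is reached. The sketch below assumes the standard genetic code and uses the first 48 nucleotides of the crtZ gene of SEQ ID NO: 2 as input; the function name `translate` and the `start_as_met` option are illustrative choices, the latter reflecting that a GTG initiation codon (as printed in SEQ ID NO: 1 and 3) is listed as Met.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna, start_as_met=True):
    """Translate a coding strand (spaces allowed) up to the first stop codon."""
    dna = dna.upper().replace(" ", "")
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        if pos == 0 and start_as_met:
            residue = "M"  # initiation codon (ATG or GTG) is read as Met
        protein.append(residue)
    return "".join(protein)


crtz_start = "ATG ACC AAT TTC CTG ATC GTC GTC GCC ACC GTG CTC GTG ATG GAG TTG"
print(translate(crtz_start))  # MTNFLIVVATVLVMEL
```

The one-letter result corresponds to Met Thr Asn Phe Leu Ile Val Val Ala Thr Val Leu Val Met Glu Leu, the first sixteen residues listed under SEQ ID NO: 2.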

The crtZ gene product (CrtZ) of Alcaligenes sp. PC-1 has an enzyme activity for adding a hydroxyl group to the
3-carbon of a β-ionone ring, and one specific example is an enzyme activity for synthesizing zeaxanthin from
β-carotene as a substrate by way of β-cryptoxanthin (Example 12 (3); see Fig. 11). Furthermore, the crtZ gene product
also has an enzyme activity for adding a hydroxyl group to the 3-carbon of a 4-keto-β-ionone ring, and one specific
example is an enzyme activity for synthesizing astaxanthin from canthaxanthin as a substrate by way of
phoenicoxanthin (Example 12 (1); see Fig. 11). Polypeptides having the latter enzyme activity and DNA strands
encoding these polypeptides have not hitherto been known. Also, the CrtZ of Alcaligenes sp. PC-1 showed significant
homology to the CrtZ of Erwinia uredovora (identity of 58%) at the level of amino acid sequence. In addition, the crtZ
gene products (CrtZ) of Agrobacterium aurantiacus sp. nov. MK1 and Alcaligenes sp. PC-1 have high homology (identity
of 90%) at the level of amino acid sequence, and the functions of both enzymes are the same. The amino acid
sequence in the 10% region having no identity between these amino acid sequences is considered not so significant
to the functions of the enzyme. It is thus considered that, particularly in this region, a small number of
substitutions by other amino acids, deletions, or additions of other amino acids will not affect the enzyme activity.
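
An identity percentage of the kind quoted above is the share of aligned positions carrying the same residue. The sketch below shows one common way to compute it for two sequences that have already been aligned; it is not the alignment software used for this document, the function name `percent_identity` is hypothetical, and the choice to skip gapped positions is an assumption (other conventions divide by the full alignment length). The two inputs are the first sixteen residues of CrtZ as listed under SEQ ID NO: 2 and SEQ ID NO: 6.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over the ungapped aligned positions."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are not counted
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * identical / compared


agrobacterium_crtz = "MTNFLIVVATVLVMEL"  # SEQ ID NO: 2, residues 1-16
alcaligenes_crtz = "MTQFLIVVATVLVMEL"    # SEQ ID NO: 6, residues 1-16
print(percent_identity(agrobacterium_crtz, alcaligenes_crtz))  # 93.75
```

The value for this short N-terminal stretch is only an example; the 90% figure in the text refers to the full-length proteins.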

(3) Consideration on minor biosynthetic pathways of xanthophylls 

It has been elucidated by our studies with carotenoid synthesis genes of the epiphytic bacterium Erwinia or the
photosynthetic bacterium Rhodobacter that carotenoid biosynthesis enzymes generally act by recognizing one half of a
carotenoid molecule as a substrate. By way of example, the product of the lycopene cyclase gene of Erwinia, crtY,
recognizes each half of the lycopene molecule to cyclize it. When the phytoene desaturase gene crtI of Rhodobacter was
used for the synthesis of neurosporene in place of lycopene in Escherichia coli and crtY of Erwinia was allowed to
work on it, the crtY gene product recognized the half molecular structure common to lycopene to produce the
half-cyclized β-zeacarotene (Linden, H., Misawa, N., Chamovitz, D., Pecker, I., Hirschberg, J., Sandmann, G.,
"Functional Complementation in Escherichia coli of Different Phytoene Desaturase Genes and Analysis of Accumulated
Carotenes", Z. Naturforsch., 46c, p. 1045-1051, 1991). Also, in the present invention, when CrtW is allowed to work on
β-carotene or zeaxanthin, echinenone or 4-ketozeaxanthin, in which one keto group has been introduced, is synthesized
first, and when CrtZ is allowed to work on β-carotene or canthaxanthin, β-cryptoxanthin or phoenicoxanthin, in which
one hydroxyl group has been introduced, is synthesized first. This is considered to be because these enzymes recognize
one half of the substrate molecule. Thus, while Escherichia coli having the crtE, crtB, crtI and crtY genes of Erwinia
and the crtZ gene of a marine bacterium produces zeaxanthin as described above, β-cryptoxanthin, which is β-carotene
having one hydroxyl group introduced thereinto, can be detected as an intermediate metabolite. It can thus be
considered that if CrtW is present, 3'-hydroxyechinenone or 3-hydroxyechinenone can be synthesized from
β-cryptoxanthin as a substrate, and that phoenicoxanthin can be further synthesized by the action of CrtW on these
intermediates. The present inventors have not identified these ketocarotenoids in the culture solutions, and the
reason is considered to be that only a trace amount of these compounds is present under the conditions of the present
experiments. In fact, it has been described that 3-hydroxyechinenone or 3'-hydroxyechinenone was detected as a minor
intermediate metabolite of astaxanthin in the marine bacterium Agrobacterium aurantiacus sp. nov. MK1, a gene source
(Akihiro Yokoyama ed., "For the biosynthesis of astaxanthin in marine bacteria", Nippon Suisan Gakkai, Spring
Symposium, 1994, Abstract, p. 252, 1994). It can be considered from the above descriptions that the minor metabolic
pathways shown in Fig. 20 are also present in addition to the main metabolic pathways of astaxanthin shown in Fig. 11.
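
The half-molecule argument above can be made concrete with a small model in which each β-ionone ring is modified independently: CrtW adds a 4-keto group to one ring, CrtZ adds a 3-hydroxyl group to one ring. The following sketch is an illustrative model only, not a description of Fig. 11 or Fig. 20; the ring labels ("", "H", "K", "HK") and the function names are hypothetical, and the compound names are assigned on the assumption that 3-hydroxyechinenone carries both groups on the same ring while 3'-hydroxyechinenone carries them on opposite rings.

```python
# Ring states: "" unmodified, "H" 3-hydroxy, "K" 4-keto, "HK" 3-hydroxy-4-keto.
# A carotenoid is a sorted pair of ring states (the two halves of the molecule).
NAMES = {
    ("", ""): "beta-carotene",
    ("", "K"): "echinenone",
    ("", "H"): "beta-cryptoxanthin",
    ("", "HK"): "3-hydroxyechinenone",
    ("K", "K"): "canthaxanthin",
    ("H", "K"): "3'-hydroxyechinenone",
    ("H", "H"): "zeaxanthin",
    ("HK", "K"): "phoenicoxanthin",
    ("H", "HK"): "4-ketozeaxanthin",
    ("HK", "HK"): "astaxanthin",
}


def act(molecule, group):
    """Apply one enzyme step to either half: "K" models CrtW, "H" models CrtZ."""
    products = set()
    for i in (0, 1):
        ring = molecule[i]
        if group in ring:
            continue  # this half already carries the group
        new_ring = "".join(sorted(ring + group))
        products.add(tuple(sorted((new_ring, molecule[1 - i]))))
    return products


def reachable(start=("", "")):
    """All compounds reachable from the start by repeated CrtW/CrtZ steps."""
    seen = {start}
    frontier = [start]
    while frontier:
        molecule = frontier.pop()
        for group in ("K", "H"):
            for product in act(molecule, group):
                if product not in seen:
                    seen.add(product)
                    frontier.append(product)
    return seen


for molecule in sorted(reachable()):
    print(NAMES[molecule])
```

In this model, CrtW acting on β-cryptoxanthin gives 3-hydroxyechinenone or 3'-hydroxyechinenone, and a second CrtW step on either gives phoenicoxanthin, which is the minor route discussed above.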

Industrial Applicability

According to the present invention, the gene clusters required for the biosynthesis of keto group-containing
xanthophylls such as astaxanthin, phoenicoxanthin, 4-ketozeaxanthin, canthaxanthin and echinenone have successfully
been obtained from marine bacteria, and their structures, nucleotide sequences, and functions have been elucidated.
The DNA strands according to the present invention are useful as genes capable of affording the ability of
biosynthesis of keto group-containing xanthophylls such as astaxanthin to microorganisms such as Escherichia coli and
the like.

SEQUENCE LISTING

SEQ ID NO: 1
SEQUENCE LENGTH: 639
SEQUENCE TYPE: STRANDEDNESS: double
TOPOLOGY: linear
MOLECULE TYPE: genomic DNA
ORIGINAL SOURCE:
ORGANISM: Agrobacterium aurantiacus
STRAIN: sp. nov. MK1

SEQUENCE

GTG CAT GCG CTG TGG TTT CTG GAC GCA GCG GCG CAT CCC ATC CTG GCG  48
Met His Ala Leu Trp Phe Leu Asp Ala Ala Ala His Pro Ile Leu Ala
  1               5                  10                  15

ATC GCA AAT TTC CTG GGG CTG ACC TGG CTG TCG GTC GGA TTG TTC ATC  96
Ile Ala Asn Phe Leu Gly Leu Thr Trp Leu Ser Val Gly Leu Phe Ile
             20                  25                  30

ATC GCG CAT GAC GCG ATG CAC GGG TCG GTG GTG CCG GGG CGT CCG CGC  144
Ile Ala His Asp Ala Met His Gly Ser Val Val Pro Gly Arg Pro Arg
         35                  40                  45

GCC AAT GCG GCG ATG GGC CAG CTT GTC CTG TGG CTG TAT GCC GGA TTT  192
Ala Asn Ala Ala Met Gly Gln Leu Val Leu Trp Leu Tyr Ala Gly Phe
     50                  55                  60

TCG TGG CGC AAG ATG ATC GTC AAG CAC ATG GCC CAT CAC CGC CAT GCC  240
Ser Trp Arg Lys Met Ile Val Lys His Met Ala His His Arg His Ala
 65                  70                  75                  80

GGA ACC GAC GAC GAC CCC GAT TTC GAC CAT GGC GGC CCG GTC CGC TGG  288
Gly Thr Asp Asp Asp Pro Asp Phe Asp His Gly Gly Pro Val Arg Trp
                 85                  90                  95

TAC GCC CGC TTC ATC GGC ACC TAT TTC GGC TGG CGC GAG GGG CTG CTG  336
Tyr Ala Arg Phe Ile Gly Thr Tyr Phe Gly Trp Arg Glu Gly Leu Leu
            100                 105                 110

CTG CCC GTC ATC GTG ACG GTC TAT GCG CTG ATC CTT GGG GAT CGC TGG  384
Leu Pro Val Ile Val Thr Val Tyr Ala Leu Ile Leu Gly Asp Arg Trp
        115                 120                 125

ATG TAC GTG GTC TTC TGG CCG CTG CCG TCG ATC CTG GCG TCG ATC CAG  432
Met Tyr Val Val Phe Trp Pro Leu Pro Ser Ile Leu Ala Ser Ile Gln
    130                 135                 140

CTG TTC GTG TTC GGC ACC TGG CTG CCG CAC CGC CCC GGC CAC GAC GCG  480
Leu Phe Val Phe Gly Thr Trp Leu Pro His Arg Pro Gly His Asp Ala
145                 150                 155                 160

TTC CCG GAC CGC CAC AAT GCG CGG TCG TCG CGG ATC AGC GAC CCC GTG  528
Phe Pro Asp Arg His Asn Ala Arg Ser Ser Arg Ile Ser Asp Pro Val
                165                 170                 175

TCG CTG CTG ACC TGC TTT CAC TTT GGC GGT TAT CAT CAC GAA CAC CAC  576
Ser Leu Leu Thr Cys Phe His Phe Gly Gly Tyr His His Glu His His
            180                 185                 190

CTG CAC CCG ACG GTG CCG TGG TGG CGC CTG CCC AGC ACC CGC ACC AAG  624
Leu His Pro Thr Val Pro Trp Trp Arg Leu Pro Ser Thr Arg Thr Lys
        195                 200                 205

GGG GAC ACC GCA TGA  639
Gly Asp Thr Ala ***
    210






EP 0 735 137 A1 



SEQ ID NO: 2
SEQUENCE LENGTH: 489
SEQUENCE TYPE: STRANDEDNESS: double
TOPOLOGY: linear
MOLECULE TYPE: genomic DNA
ORIGINAL SOURCE:
ORGANISM: Agrobacterium aurantiacus
STRAIN: sp. nov. MK1

SEQUENCE

ATG ACC AAT TTC CTG ATC GTC GTC GCC ACC GTG CTC GTG ATG GAG TTG  48
Met Thr Asn Phe Leu Ile Val Val Ala Thr Val Leu Val Met Glu Leu
  1               5                  10                  15

ACG GCC TAT TCC GTC CAC CGC TGG ATC ATG CAC GGC CCC CTG GGC TGG  96
Thr Ala Tyr Ser Val His Arg Trp Ile Met His Gly Pro Leu Gly Trp
             20                  25                  30

GGC TGG CAC AAG TCC CAC CAC GAG GAA CAC GAC CAC GCG CTG GAA AAG  144
Gly Trp His Lys Ser His His Glu Glu His Asp His Ala Leu Glu Lys
         35                  40                  45

AAC GAC CTG TAC GGC CTG GTC TTT GCG GTG ATC GCC ACG GTG CTG TTC  192
Asn Asp Leu Tyr Gly Leu Val Phe Ala Val Ile Ala Thr Val Leu Phe
     50                  55                  60

ACG GTG GGC TGG ATC TGG GCG CCG GTC CTG TGG TGG ATC GCC TTG GGC  240
Thr Val Gly Trp Ile Trp Ala Pro Val Leu Trp Trp Ile Ala Leu Gly
 65                  70                  75                  80

ATG ACT GTC TAT GGG CTG ATC TAT TTC GTC CTG CAT GAC GGG CTG GTG  288
Met Thr Val Tyr Gly Leu Ile Tyr Phe Val Leu His Asp Gly Leu Val
                 85                  90                  95

CAT CAG CGC TGG CCG TTC CGT TAT ATC CCG CGC AAG GGC TAT GCC AGA  336
His Gln Arg Trp Pro Phe Arg Tyr Ile Pro Arg Lys Gly Tyr Ala Arg
            100                 105                 110

CGC CTG TAT CAG GCC CAC CGC CTG CAC CAT GCG GTC GAG GGG CGC GAC  384
Arg Leu Tyr Gln Ala His Arg Leu His His Ala Val Glu Gly Arg Asp
        115                 120                 125

CAT TGC GTC AGC TTC GGC TTC ATC TAT GCG CCC CCG GTC GAC AAG CTG  432
His Cys Val Ser Phe Gly Phe Ile Tyr Ala Pro Pro Val Asp Lys Leu
    130                 135                 140

AAG CAG GAC CTG AAG ATG TCG GGC GTG CTG CGG GCC GAG GCG CAG GAG  480
Lys Gln Asp Leu Lys Met Ser Gly Val Leu Arg Ala Glu Ala Gln Glu
145                 150                 155                 160

CGC ACG TGA  489
Arg Thr ***



SEQ ID NO: 3
SEQUENCE LENGTH: 1161
SEQUENCE TYPE: STRANDEDNESS: double
TOPOLOGY: linear
MOLECULE TYPE: genomic DNA
ORIGINAL SOURCE:
ORGANISM: Agrobacterium aurantiacus
STRAIN: sp. nov. MK1

SEQUENCE

GTG ACC CAT GAC GTG CTG CTG GCA GGG GCG GGC CTT GCC AAC GGG CTG  48
Met Thr His Asp Val Leu Leu Ala Gly Ala Gly Leu Ala Asn Gly Leu
  1               5                  10                  15

ATC GCC CTG GCG CTG CGC GCG GCG CGG CCC GAC CTG CGC GTG CTG CTG  96
Ile Ala Leu Ala Leu Arg Ala Ala Arg Pro Asp Leu Arg Val Leu Leu
             20                  25                  30

CTG GAC CAT GCC GCA GGA CCG TCA GAC GGC CAC ACC TGG TCC TGC CAC  144
Leu Asp His Ala Ala Gly Pro Ser Asp Gly His Thr Trp Ser Cys His
         35                  40                  45

GAC CCC GAC CTG TCG CCG GAC TGG CTG GCG CGG CTG AAG CCC CTG CGC  192
Asp Pro Asp Leu Ser Pro Asp Trp Leu Ala Arg Leu Lys Pro Leu Arg
     50                  55                  60

CGC GCC AAC TGG CCC GAC CAG GAG GTG CGC TTT CCC CGC CAT GCC CGG  240
Arg Ala Asn Trp Pro Asp Gln Glu Val Arg Phe Pro Arg His Ala Arg
 65                  70                  75                  80

CGG CTG GCC ACC GGT TAC GGG TCG CTG GAC GGG GCG GCG CTG GCG GAT  288
Arg Leu Ala Thr Gly Tyr Gly Ser Leu Asp Gly Ala Ala Leu Ala Asp
                 85                  90                  95

GCG GTG GTC CGG TCG GGC GCC GAG ATC CGC TGG GAC AGC GAC ATC GCC  336
Ala Val Val Arg Ser Gly Ala Glu Ile Arg Trp Asp Ser Asp Ile Ala
            100                 105                 110

CTG CTG GAT GCG CAG GGG GCG ACG CTG TCC TGC GGC ACC CGG ATC GAG  384
Leu Leu Asp Ala Gln Gly Ala Thr Leu Ser Cys Gly Thr Arg Ile Glu
        115                 120                 125

GCG GGC GCG GTC CTG GAC GGG CGG GGC GCG CAG CCG TCG CGG CAT CTG  432
Ala Gly Ala Val Leu Asp Gly Arg Gly Ala Gln Pro Ser Arg His Leu
    130                 135                 140

ACC GTG GGT TTC CAG AAA TTC GTG GGT GTC GAG ATC GAG ACC GAC CGC  480
Thr Val Gly Phe Gln Lys Phe Val Gly Val Glu Ile Glu Thr Asp Arg
145                 150                 155                 160

CCC CAC GGC GTG CCC CGC CCG ATG ATC ATG GAC GCG ACC GTC ACC CAG  528
Pro His Gly Val Pro Arg Pro Met Ile Met Asp Ala Thr Val Thr Gln
                165                 170                 175

CAG GAC GGG TAC CGC TTC ATC TAT CTG CTG CCC TTC TCT CCG ACG CGC  576
Gln Asp Gly Tyr Arg Phe Ile Tyr Leu Leu Pro Phe Ser Pro Thr Arg
            180                 185                 190

ATC CTG ATC GAG GAC ACG CGC TAT TCC GAT GGC GGC GAT CTG GAC GAC  624
Ile Leu Ile Glu Asp Thr Arg Tyr Ser Asp Gly Gly Asp Leu Asp Asp
        195                 200                 205

GAC GCG CTG GCG GCG GCG TCC CAC GAC TAT GCC CGC CAG CAG GGC TGG  672
Asp Ala Leu Ala Ala Ala Ser His Asp Tyr Ala Arg Gln Gln Gly Trp
    210                 215                 220

ACC GGG GCC GAG GTC CGG CGC GAA CGC GGC ATC CTT CCC ATC GCG CTG  720
Thr Gly Ala Glu Val Arg Arg Glu Arg Gly Ile Leu Pro Ile Ala Leu
225                 230                 235                 240

GCC CAT GAT GCG GCG GGC TTC TGG GCC GAT CAC GCG GCG GGG CCT GTT  768
Ala His Asp Ala Ala Gly Phe Trp Ala Asp His Ala Ala Gly Pro Val
                245                 250                 255

CCC GTG GGA CTG CGC GCG GGG TTC TTT CAT CCG GTC ACC GGC TAT TCG  816
Pro Val Gly Leu Arg Ala Gly Phe Phe His Pro Val Thr Gly Tyr Ser
            260                 265                 270

CTG CCC TAT GCG GCA CAG GTG GCG GAC GTG GTG GCG GGT CTG TCC GGG  864
Leu Pro Tyr Ala Ala Gln Val Ala Asp Val Val Ala Gly Leu Ser Gly
        275                 280                 285

CCG CCC GGC ACC GAC GCG CTG CGC GGC GCC ATC CGC GAT TAC GCG ATC  912
Pro Pro Gly Thr Asp Ala Leu Arg Gly Ala Ile Arg Asp Tyr Ala Ile
    290                 295                 300

GAC CGG GCG CGC CGC GAC CGC TTT CTG CGC CTT TTG AAC CGG ATG CTG  960
Asp Arg Ala Arg Arg Asp Arg Phe Leu Arg Leu Leu Asn Arg Met Leu
305                 310                 315                 320

TTC CGC GGC TGC GCG CCC GAC CGG CGC TAT ACC CTG CTG CAG CGG TTC  1008
Phe Arg Gly Cys Ala Pro Asp Arg Arg Tyr Thr Leu Leu Gln Arg Phe
                325                 330                 335

TAC CGC ATG CCG CAT GGA CTG ATC GAA CGG TTC TAT GCC GGC CGG CTG  1056
Tyr Arg Met Pro His Gly Leu Ile Glu Arg Phe Tyr Ala Gly Arg Leu
            340                 345                 350

AGC GTG GCG GAT CAG CTG CGC ATC GTG ACC GGC AAG CCT CCC ATT CCC  1104
Ser Val Ala Asp Gln Leu Arg Ile Val Thr Gly Lys Pro Pro Ile Pro
        355                 360                 365

CTT GGC ACG GCC ATC CGC TGC CTG CCC GAA CGT CCC CTG CTG AAG GAA  1152
Leu Gly Thr Ala Ile Arg Cys Leu Pro Glu Arg Pro Leu Leu Lys Glu
    370                 375                 380

AAC GCA TGA  1161
Asn Ala ***
385



SEQ ID NO: 4
SEQUENCE LENGTH: 2886
SEQUENCE TYPE: STRANDEDNESS: double
TOPOLOGY: linear
MOLECULE TYPE: genomic DNA
ORIGINAL SOURCE:
ORGANISM: Agrobacterium aurantiacus
STRAIN: sp. nov. MK1

SEQUENCE

GGATCCGGCG ACCTTGCGGC GCTGCGCCGC GCGCCTTTGC TGGTGCCTGG CCCGGGTGGC 60
CCTAGGCCGC TGGAACGCCG GGACGCGGCG CGCGGAAACG ACCACGGACC CGGCGCACCG

CAATGGTCGC AAGCAACGGG GATGGAAACC GGCGATGCGG GACTGTAGTC TGCGCGGATC 120
GTTACCAGCG TTCGTTGCCC CTACCTTTGG CCGCTACGCC CTGACATCAG ACGCGCCTAG

GCCGGTCCGG GGGACAAGAT GAGCGCACAT GCCCTGCCCA AGGCAGATCT GACCGCCACC 180
CGGCCAGGCC CCCTGTTCTA CTCGCGTGTA CGGGACGGGT TCCGTCTAGA CTGGCGGTGG

AGCCTGATCG TCTCGGGCGG CATCATCGCC GCTTGGCTGG CCCTGCATGT GCATGCGCTG 240
TCGGACTAGC AGAGCCCGCC GTAGTAGCGG CGAACCGACC GGGACGTACA CGTACGCGAC

TGGTTTCTGG ACGCAGCGGC GCATCCCATC CTGGCGATCG CAAATTTCCT GGGGCTGACC 300
ACCAAAGACC TGCGTCGCCG CGTAGGGTAG GACCGCTAGC GTTTAAAGGA CCCCGACTGG

TGGCTGTCGG TCGGATTGTT CATCATCGCG CATGACGCGA TGCACGGGTC GGTGGTGCCG 360
ACCGACAGCC AGCCTAACAA GTAGTAGCGC GTACTGCGCT ACGTGCCCAG CCACCACGGC

GGGCGTCCGC GCGCCAATGC GGCGATGGGC CAGCTTGTCC TGTGGCTGTA TGCCGGATTT 420
CCCGCAGGCG CGCGGTTACG CCGCTACCCG GTCGAACAGG ACACCGACAT ACGGCCTAAA

TCGTGGCGCA AGATGATCGT CAAGCACATG GCCCATCACC GCCATGCCGG AACCGACGAC 480
AGCACCGCGT TCTACTAGCA GTTCGTGTAC CGGGTAGTGG CGGTACGGCC TTGGCTGCTG

GACCCCGATT TCGACCATGG CGGCCCGGTC CGCTGGTACG CCCGCTTCAT CGGCACCTAT 540
CTGGGGCTAA AGCTGGTACC GCCGGGCCAG GCGACCATGC GGGCGAAGTA GCCGTGGATA

TTCGGCTGGC GCGAGGGGCT GCTGCTGCCC GTCATCGTGA CGGTCTATGC GCTGATCCTT 600
AAGCCGACCG CGCTCCCCGA CGACGACGGG CAGTAGCACT GCCAGATACG CGACTAGGAA

GGGGATCGCT GGATGTACGT GGTCTTCTGG CCGCTGCCGT CGATCCTGGC GTCGATCCAG 660
CCCCTAGCGA CCTACATGCA CCAGAAGACC GGCGACGGCA GCTAGGACCG CAGCTAGGTC

CTGTTCGTGT TCGGCACCTG GCTGCCGCAC CGCCCCGGCC ACGACGCGTT CCCGGACCGC 720
GACAAGCACA AGCCGTGGAC CGACGGCGTG GCGGGGCCGG TGCTGCGCAA GGGCCTGGCG

CACAATGCGC GGTCGTCGCG GATCAGCGAC CCCGTGTCGC TGCTGACCTG CTTTCACTTT 780
GTGTTACGCG CCAGCAGCGC CTAGTCGCTG GGGCACAGCG ACGACTGGAC GAAAGTGAAA

GGCGGTTATC ATCACGAACA CCACCTGCAC CCGACGGTGC CGTGGTGGCG CCTGCCCAGC 840
CCGCCAATAG TAGTGCTTGT GGTGGACGTG GGCTGCCACG GCACCACCGC GGACGGGTCG

ACCCGCACCA AGGGGGACAC CGCATGACCA ATTTCCTGAT CGTCGTCGCC ACCGTGCTCG 900
TGGGCGTGGT TCCCCCTGTG GCGTACTGGT TAAAGGACTA GCAGCAGCGG TGGCACGAGC

TGATGGAGTT GACGGCCTAT TCCGTCCACC GCTGGATCAT GCACGGCCCC CTGGGCTGGG 960
ACTACCTCAA CTGCCGGATA AGGCAGGTGG CGACCTAGTA CGTGCCGGGG GACCCGACCC

GCTGGCACAA GTCCCACCAC GAGGAACACG ACCACGCGCT GGAAAAGAAC GACCTGTACG 1020
CGACCGTGTT CAGGGTGGTG CTCCTTGTGC TGGTGCGCGA CCTTTTCTTG CTGGACATGC

GCCTGGTCTT TGCGGTGATC GCCACGGTGC TGTTCACGGT GGGCTGGATC TGGGCGCCGG 1080
CGGACCAGAA ACGCCACTAG CGGTGCCACG ACAAGTGCCA CCCGACCTAG ACCCGCGGCC

TCCTGTGGTG GATCGCCTTG GGCATGACTG TCTATGGGCT GATCTATTTC GTCCTGCATG 1140
AGGACACCAC CTAGCGGAAC CCGTACTGAC AGATACCCGA CTAGATAAAG CAGGACGTAC

ACGGGCTGGT GCATCAGCGC TGGCCGTTCC GTTATATCCC GCGCAAGGGC TATGCCAGAC 1200
TGCCCGACCA CGTAGTCGCG ACCGGCAAGG CAATATAGGG CGCGTTCCCG ATACGGTCTG

GCCTGTATCA GGCCCACCGC CTGCACCATG CGGTCGAGGG GCGCGACCAT TGCGTCAGCT 1260
CGGACATAGT CCGGGTGGCG GACGTGGTAC GCCAGCTCCC CGCGCTGGTA ACGCAGTCGA

TCGGCTTCAT CTATGCGCCC CCGGTCGACA AGCTGAAGCA GGACCTGAAG ATGTCGGGCG 1320
AGCCGAAGTA GATACGCGGG GGCCAGCTGT TCGACTTCGT CCTGGACTTC TACAGCCCGC

TGCTGCGGGC CGAGGCGCAG GAGCGCACGT GACCCATGAC GTGCTGCTGG CAGGGGCGGG 1380
ACGACGCCCG GCTCCGCGTC CTCGCGTGCA CTGGGTACTG CACGACGACC GTCCCCGCCC

CCTTGCCAAC GGGCTGATCG CCCTGGCGCT GCGCGCGGCG CGGCCCGACC TGCGCGTGCT 1440
GGAACGGTTG CCCGACTAGC GGGACCGCGA CGCGCGGCGC GCCGGGCTGG ACGCGCACGA

GCTGCTGGAC CATGCCGCAG GACCGTCAGA CGGCCACACC TGGTCCTGCC ACGACCCCGA 1500
CGACGACCTG GTACGGCGTC CTGGCAGTCT GCCGGTGTGG ACCAGGACGG TGCTGGGGCT

CCTGTCGCCG GACTGGCTGG CGCGGCTGAA GCCCCTGCGC CGCGCCAACT GGCCCGACCA 1560
GGACAGCGGC CTGACCGACC GCGCCGACTT CGGGGACGCG GCGCGGTTGA CCGGGCTGGT

GGAGGTGCGC TTTCCCCGCC ATGCCCGGCG GCTGGCCACC GGTTACGGGT CGCTGGACGG 1620
CCTCCACGCG AAAGGGGCGG TACGGGCCGC CGACCGGTGG CCAATGCCCA GCGACCTGCC

GGCGGCGCTG GCGGATGCGG TGGTCCGGTC GGGCGCCGAG ATCCGCTGGG ACAGCGACAT 1680
CCGCCGCGAC CGCCTACGCC ACCAGGCCAG CCCGCGGCTC TAGGCGACCC TGTCGCTGTA

CGCCCTGCTG GATGCGCAGG GGGCGACGCT GTCCTGCGGC ACCCGGATCG AGGCGGGCGC 1740
GCGGGACGAC CTACGCGTCC CCCGCTGCGA CAGGACGCCG TGGGCCTAGC TCCGCCCGCG

GGTCCTGGAC GGGCGGGGCG CGCAGCCGTC GCGGCATCTG ACCGTGGGTT TCCAGAAATT 1800
CCAGGACCTG CCCGCCCCGC GCGTCGGCAG CGCCGTAGAC TGGCACCCAA AGGTCTTTAA

CGTGGGTGTC GAGATCGAGA CCGACCGCCC CCACGGCGTG CCCCGCCCGA TGATCATGGA 1860
GCACCCACAG CTCTAGCTCT GGCTGGCGGG GGTGCCGCAC GGGGCGGGCT ACTAGTACCT

CGCGACCGTC ACCCAGCAGG ACGGGTACCG CTTCATCTAT CTGCTGCCCT TCTCTCCGAC 1920
GCGCTGGCAG TGGGTCGTCC TGCCCATGGC GAAGTAGATA GACGACGGGA AGAGAGGCTG

GCGCATCCTG ATCGAGGACA CGCGCTATTC CGATGGCGGC GATCTGGACG ACGACGCGCT 1980
CGCGTAGGAC TAGCTCCTGT GCGCGATAAG GCTACCGCCG CTAGACCTGC TGCTGCGCGA

GGCGGCGGCG TCCCACGACT ATGCCCGCCA GCAGGGCTGG ACCGGGGCCG AGGTCCGGCG 2040
CCGCCGCCGC AGGGTGCTGA TACGGGCGGT CGTCCCGACC TGGCCCCGGC TCCAGGCCGC

CGAACGCGGC ATCCTTCCCA TCGCGCTGGC CCATGATGCG GCGGGCTTCT GGGCCGATCA 2100
GCTTGCGCCG TAGGAAGGGT AGCGCGACCG GGTACTACGC CGCCCGAAGA CCCGGCTAGT

CGCGGCGGGG CCTGTTCCCG TGGGACTGCG CGCGGGGTTC TTTCATCCGG TCACCGGCTA 2160
GCGCCGCCCC GGACAAGGGC ACCCTGACGC GCGCCCCAAG AAAGTAGGCC AGTGGCCGAT

TTCGCTGCCC TATGCGGCAC AGGTGGCGGA CGTGGTGGCG GGTCTGTCCG GGCCGCCCGG 2220
AAGCGACGGG ATACGCCGTG TCCACCGCCT GCACCACCGC CCAGACAGGC CCGGCGGGCC

CACCGACGCG CTGCGCGGCG CCATCCGCGA TTACGCGATC GACCGGGCGC GCCGCGACCG 2280
GTGGCTGCGC GACGCGCCGC GGTAGGCGCT AATGCGCTAG CTGGCCCGCG CGGCGCTGGC

CTTTCTGCGC CTTTTGAACC GGATGCTGTT CCGCGGCTGC GCGCCCGACC GGCGCTATAC 2340
GAAAGACGCG GAAAACTTGG CCTACGACAA GGCGCCGACG CGCGGGCTGG CCGCGATATG

CCTGCTGCAG CGGTTCTACC GCATGCCGCA TGGACTGATC GAACGGTTCT ATGCCGGCCG 2400
GGACGACGTC GCCAAGATGG CGTACGGCGT ACCTGACTAG CTTGCCAAGA TACGGCCGGC

GCTGAGCGTG GCGGATCAGC TGCGCATCGT GACCGGCAAG CCTCCCATTC CCCTTGGCAC 2460
CGACTCGCAC CGCCTAGTCG ACGCGTAGCA CTGGCCGTTC GGAGGGTAAG GGGAACCGTG

GGCCATCCGC TGCCTGCCCG AACGTCCCCT GCTGAAGGAA AACGCATGAA CGCCCATTCG 2520
CCGGTAGGCG ACGGACGGGC TTGCAGGGGA CGACTTCCTT TTGCGTACTT GCGGGTAAGC

CCCGCGGCCA AGACCGCCAT CGTGATCGGC GCAGGCTTTG GCGGGCTGGC CCTGGCCATC 2580
GGGCGCCGGT TCTGGCGGTA GCACTAGCCG CGTCCGAAAC CGCCCGACCG GGACCGGTAG

CGCCTGCAGT CCGCGGGCAT CGCCACCACC CTGGTCGAGG CCCGGGACAA GCCCGGCGGG 2640
GCGGACGTCA GGCGCCCGTA GCGGTGGTGG GACCAGCTCC GGGCCCTGTT CGGGCCGCCC

CGCGCCTATG TCTGGCACGA TCAGGGCCAT CTCTTCGACG CGGGCCCGAC CGTCATCACC 2700
GCGCGGATAC AGACCGTGCT AGTCCCGGTA GAGAAGCTGC GCCCGGGCTG GCAGTAGTGG

GACCCCGATG CGCTGAAAGA GCTGTGGGCC CTGACCGGGC AGGACATGGC GCGCGACGTG 2760
CTGGGGCTAC GCGAGTTTCT CGACACCCGG GACTGGCCCG TCCTGTACCG CGCGCTGCAC

ACGCTGATGC CGGTCTCGCC CTTCTATCGG CTGATGTGGC CGGGCGGGAA GGTCTTCGAT 2820
TGCGACTACG GCCAGAGCGG GAAGATAGCC GACTACACCG GCCCGCCCTT CCAGAAGCTA

TACGTGAACG AGGCCGATCC AGGGTCTGGG TCTTGCCGTG CCAGGTGAAG CTGTTGCCGT 2880
ATGCACTTGC TCCGGCTAGG TCCCAGACCC AGAACGGCAC GGTCCACTTC GACAACGGCA

GGATCC 2886
CCTAGG



SEQ ID NO: 5
SEQUENCE LENGTH: 729
SEQUENCE TYPE: STRANDEDNESS: double
TOPOLOGY: linear
MOLECULE TYPE: genomic DNA
ORIGINAL SOURCE:
ORGANISM: Alcaligenes
STRAIN: sp. PC-1

SEQUENCE

ATG TCC GGA CGG AAG CCT GGC ACA ACT GGC GAC ACG ATC GTC AAT CTC  48
Met Ser Gly Arg Lys Pro Gly Thr Thr Gly Asp Thr Ile Val Asn Leu
  1               5                  10                  15

GGT CTG ACC GCC GCG ATC CTG CTG TGC TGG CTG GTC CTG CAC GCC TTT  96
Gly Leu Thr Ala Ala Ile Leu Leu Cys Trp Leu Val Leu His Ala Phe
             20                  25                  30

ACG CTA TGG TTG CTA GAT GCG GCC GCG CAT CCG CTG CTT GCC GTG CTG  144
Thr Leu Trp Leu Leu Asp Ala Ala Ala His Pro Leu Leu Ala Val Leu
         35                  40                  45

TGC CTG GCT GGG CTG ACC TGG CTG TCG GTC GGG CTG TTC ATC ATC GCG  192
Cys Leu Ala Gly Leu Thr Trp Leu Ser Val Gly Leu Phe Ile Ile Ala
     50                  55                  60

CAT GAC GCA ATG CAC GGG TCC GTG GTG CCG GGG CGG CCG CGC GCC AAT  240
His Asp Ala Met His Gly Ser Val Val Pro Gly Arg Pro Arg Ala Asn
 65                  70                  75                  80

GCG GCG ATC GGG CAA CTG GCG CTG TGG CTC TAT GCG GGG TTC TCG TGG  288
Ala Ala Ile Gly Gln Leu Ala Leu Trp Leu Tyr Ala Gly Phe Ser Trp
                 85                  90                  95

CCC AAG CTG ATC GCG AAG CAC ATG ACG CAT CAC CGG CAC GCC GGC ACC  336
Pro Lys Leu Ile Ala Lys His Met Thr His His Arg His Ala Gly Thr
            100                 105                 110

GAC AAC GAT CCC GAT TTC GGT CAC GGA GGG CCC GTG CGC TGG TAC GGC  384
Asp Asn Asp Pro Asp Phe Gly His Gly Gly Pro Val Arg Trp Tyr Gly
        115                 120                 125

AGC TTC GTC TCC ACC TAT TTC GGC TGG CGA GAG GGA CTG CTG CTA CCG  432
Ser Phe Val Ser Thr Tyr Phe Gly Trp Arg Glu Gly Leu Leu Leu Pro
    130                 135                 140

GTG ATC GTC ACC ACC TAT GCG CTG ATC CTG GGC GAT CGC TGG ATG TAT  480
Val Ile Val Thr Thr Tyr Ala Leu Ile Leu Gly Asp Arg Trp Met Tyr
145                 150                 155                 160

GTC ATC TTC TGG CCG GTC CCG GCC GTT CTG GCG TCG ATC CAG ATT TTC  528
Val Ile Phe Trp Pro Val Pro Ala Val Leu Ala Ser Ile Gln Ile Phe
                165                 170                 175

GTC TTC GGA ACT TGG CTG CCC CAC CGC CCG GGA CAT GAC GAT TTT CCC  576
Val Phe Gly Thr Trp Leu Pro His Arg Pro Gly His Asp Asp Phe Pro
            180                 185                 190

GAC CGG CAC AAC GCG AGG TCG ACC GGC ATC GGC GAC CCG TTG TCA CTA  624
Asp Arg His Asn Ala Arg Ser Thr Gly Ile Gly Asp Pro Leu Ser Leu
        195                 200                 205

CTG ACC TGC TTC CAT TTC GGC GGC TAT CAC CAC GAA CAT CAC CTG CAT  672
Leu Thr Cys Phe His Phe Gly Gly Tyr His His Glu His His Leu His
    210                 215                 220

CCG CAT GTG CCG TGG TGG CGC CTG CCT CGT ACA CGC AAG ACC GGA GGC  720
Pro His Val Pro Trp Trp Arg Leu Pro Arg Thr Arg Lys Thr Gly Gly
225                 230                 235                 240

CGC GCA TGA  729
Arg Ala ***

SO 
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SEQ ID NO: 6 

SEQUENCE LENGTH: 489 

SEQUENCE TYPE: STRANDEDNESS : double • 
TOPOLOGY: linear 

MOLECULE TYPE: genomic DNA 
ORIGINAL SOURCE: 

ORGANISM: Alcaligenes 
STRAIN: sp. PC-1 



SEQUENCE . 

ATG ACG CAA TTC CTC ATT GTC GTG GCG ACA GTC CTC GTG ATG GAG CTG 48

Met Thr Gln Phe Leu Ile Val Val Ala Thr Val Leu Val Met Glu Leu

5 10 15

ACC GCC TAT TCC GTC CAC CGC TGG ATT ATG CAC GGC CCC CTA GGC TGG 96

Thr Ala Tyr Ser Val His Arg Trp Ile Met His Gly Pro Leu Gly Trp

20 25 30

GGC TGG CAC AAG TCC CAT CAC GAA GAG CAC GAC CAC GCG TTG GAG AAG 144

Gly Trp His Lys Ser His His Glu Glu His Asp His Ala Leu Glu Lys

35 40 45

AAC GAC CTC TAC GGC GTC GTC TTC GCG GTG CTG GCG ACG ATC CTC TTC 192

Asn Asp Leu Tyr Gly Val Val Phe Ala Val Leu Ala Thr Ile Leu Phe

50 55 60






ACC GTG GGC GCC TAT TGG TGG CCG GTG CTG TGG TGG ATC GCC CTG GGC 240

Thr Val Gly Ala Tyr Trp Trp Pro Val Leu Trp Trp Ile Ala Leu Gly

65 70 75 80

ATG ACG GTC TAT GGG TTG ATC TAT TTC ATC CTG CAC GAC GGG CTT GTG 288

Met Thr Val Tyr Gly Leu Ile Tyr Phe Ile Leu His Asp Gly Leu Val

85 90 95

CAT CAA CGC TGG CCG TTT CGG TAT ATT CCG CGG CGG GGC TAT TTC CGC 336

His Gln Arg Trp Pro Phe Arg Tyr Ile Pro Arg Arg Gly Tyr Phe Arg

100 105 110

AGG CTC TAC CAA GCT CAT CGC CTG CAC CAC GCG GTC GAG GGG CGG GAC 384

Arg Leu Tyr Gln Ala His Arg Leu His His Ala Val Glu Gly Arg Asp

115 120 125

CAC TGC GTC AGC TTC GGC TTC ATC TAT GCC CCA CCC GTG GAC AAG CTG 432

His Cys Val Ser Phe Gly Phe Ile Tyr Ala Pro Pro Val Asp Lys Leu

130 135 140

AAG CAG GAT CTG AAG CGG TCG GGT GTC CTG CGC CCC CAG GAC GAG CGT 480

Lys Gln Asp Leu Lys Arg Ser Gly Val Leu Arg Pro Gln Asp Glu Arg

145 150 155 160

CCG TCG TGA 489

Pro Ser ***
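
The stated SEQUENCE LENGTH of 489 for SEQ ID NO: 6 can be cross-checked against the 162 amino acids referred to in the claims (162 codons plus one stop codon). The following is a minimal Python sketch of that check, not part of the original disclosure. It assumes the standard genetic code; the names `CODON_TABLE`, `SEQ_ID_NO_6` and `translate` are hypothetical, and the nucleotide rows are transcribed from the listing as read here, so any transcription error would carry over.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order (assumption:
# the standard table applies to this bacterial gene).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# SEQ ID NO: 6 as transcribed from the listing, 16 codons per row.
SEQ_ID_NO_6 = "".join("""
ATG ACG CAA TTC CTC ATT GTC GTG GCG ACA GTC CTC GTG ATG GAG CTG
ACC GCC TAT TCC GTC CAC CGC TGG ATT ATG CAC GGC CCC CTA GGC TGG
GGC TGG CAC AAG TCC CAT CAC GAA GAG CAC GAC CAC GCG TTG GAG AAG
AAC GAC CTC TAC GGC GTC GTC TTC GCG GTG CTG GCG ACG ATC CTC TTC
ACC GTG GGC GCC TAT TGG TGG CCG GTG CTG TGG TGG ATC GCC CTG GGC
ATG ACG GTC TAT GGG TTG ATC TAT TTC ATC CTG CAC GAC GGG CTT GTG
CAT CAA CGC TGG CCG TTT CGG TAT ATT CCG CGG CGG GGC TAT TTC CGC
AGG CTC TAC CAA GCT CAT CGC CTG CAC CAC GCG GTC GAG GGG CGG GAC
CAC TGC GTC AGC TTC GGC TTC ATC TAT GCC CCA CCC GTG GAC AAG CTG
AAG CAG GAT CTG AAG CGG TCG GGT GTC CTG CGC CCC CAG GAC GAG CGT
CCG TCG TGA
""".split())


def translate(dna):
    """Translate an open reading frame up to, but not including, the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


protein = translate(SEQ_ID_NO_6)
print(len(SEQ_ID_NO_6), len(protein))
```

The same helper could be applied to any other coding sequence in the listing, provided it is supplied in frame from its start codon.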
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SEQ ID NO: 7 

SEQUENCE LENGTH: 1631 

SEQUENCE TYPE: nucleic acid
STRANDEDNESS: double

TOPOLOGY: linear

MOLECULE TYPE: genomic DNA

ORIGINAL SOURCE:

ORGANISM: Alcaligenes
STRAIN: sp. PC-1

SEQUENCE

CTGCAGGCCG GGCCCGGTGG CCAATGGTCG CAACCGGCAG GACTGGAACA GGACGGCGGG 60
GACGTCCGGC CCGGGCCACC GGTTACCAGC GTTGGCCGTC CTGACCTTGT CCTGCCGCCC

CCGGTCTAGG CTGTCGCCCT ACGCAGCAGG AGTTTCGGAT GTCCGGACGG AAGCCTGGCA 120
GGCCAGATCC GACAGCGGGA TGCGTCGTCC TCAAAGCCTA CAGGCCTGCC TTCGGACCGT

CAACTGGCGA CACGATCGTC AATCTCGGTC TGACCGCCGC GATCCTGCTG TGCTGGCTGG 180
GTTGACCGCT GTGCTAGCAG TTAGAGCCAG ACTGGCGGCG CTAGGACGAC ACGACCGACC

TCCTGCACGC CTTTACGCTA TGGTTGCTAG ATGCGGCCGC GCATCCGCTG CTTGCCGTGC 240
AGGACGTGCG GAAATGCGAT ACCAACGATC TACGCCGGCG CGTAGGCGAC GAACGGCACG






TGTGCCTGGC TGGGCTGACC TGGCTGTCGG TCGGGCTGTT CATCATCGCG CATGACGCAA 300

ACACGGACCG ACCCGACTGG ACCGACAGCC AGCCCGACAA GTAGTAGCGC GTACTGCGTT

TGCACGGGTC CGTGGTGCCG GGGCGGCCGC GCGCCAATGC GGCGATCGGG CAACTGGCGC 360

ACGTGCCCAG GCACCACGGC CCCGCCGGCG CGCGGTTACG CCGCTAGCCC GTTGACCGCG

TGTGGCTCTA TGCGGGGTTC TCGTGGCCCA AGCTGATCGC CAAGCACATG ACGCATCACC 420

ACACCGAGAT ACGCCCCAAG AGCACCGGGT TCGACTAGCG GTTCGTGTAC TGCGTAGTGG

GGCACGCCGG CACCGACAAC GATCCCGATT TCGGTCACGG AGGGCCCGTG CGCTGGTACG 480

CCGTGCGGCC GTGGCTGTTG CTAGGGCTAA AGCCAGTGCC TCCCGGGCAC GCGACCATGC

GCAGCTTCGT CTCCACCTAT TTCGGCTGGC GAGAGGGACT GCTGCTACCG GTGATCGTCA 540

CGTCGAAGCA GAGGTGGATA AAGCCGACCG CTCTCCCTGA CGACGATGGC CACTAGCAGT

CCACCTATGC GCTGATCCTG GGCGATCGCT GGATGTATGT CATCTTCTGG CCGGTCCCGG 600

GGTGGATACG CGACTAGGAC CCGCTAGCGA CCTACATACA GTAGAAGACC GGCCAGGGCC

CCGTTCTGGC GTCGATCCAG ATTTTCGTCT TCGGAACTTG GCTGCCCCAC CGCCCGGGAC 660

GGCAAGACCG CAGCTAGGTC TAAAAGCAGA AGCCTTGAAC CGACGGGGTG GCGGGCCCTG

ATGACGATTT TCCCGACCGG CACAACGCGA GGTCGACCGG CATCGGCGAC CCGTTGTCAC 720

TACTGCTAAA AGGGCTGGCC GTGTTGCGCT CCAGCTGGCC GTAGCCGCTG GGCAACAGTG
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TACTGACCTG CTTCCA7TTC GGC<;GCTA7C ACCACGAACA TCACCTGCAT CCGCATCTGC. 780 

ATGACTGGAC GAAGGTAAAG CCGCCGATAG TGGTGCTTGT AGTGGACGTA GGCGTACACG

CGTGGTGGCG CCTGCCTCGT ACACGCAAGA CCGGAGGCCG CGCATGACGC AATTCCTCAT 840

GCACCACCGC GGACGGAGCA TGTGCGTTCT GGCCTCCGGC GCGTACTGCG TTAAGGAGTA

TGTCGTGGCG ACAGTCCTCG TGATGGAGCT GACCGCCTAT TCCGTCCACC GCTGGATTAT 900

ACAGCACCGC TGTCAGGAGC ACTACCTCGA CTGGCGGATA AGGCAGGTGG CGACCTAATA

GCACGGCCCC CTAGGCTGGG GCTGGCACAA GTCCCATCAC GAAGAGCACG ACCACGCGTT 960

CGTGCCGGGG GATCCGACCC CGACCGTGTT CAGGGTAGTG CTTCTCGTGC TGGTGCGCAA

GGAGAAGAAC GACCTCTACG GCGTCGTCTT CGCGGTGCTG GCGACGATCC TCTTCACCGT 1020

CCTCTTCTTG CTGGAGATGC CGCAGCAGAA GCGCCACGAC CGCTGCTAGG AGAAGTGGCA

GGGCGCCTAT TGGTGGCCGG TGCTGTGGTG GATCGCCCTG GGCATGACGG TCTATGGGTT 1080

CCCGCGGATA ACCACCGGCC ACGACACCAC CTAGCGGGAC CCGTACTGCC AGATACCCAA

GATCTATTTC ATCCTGCACG ACGGGCTTGT GCATCAACGC TGGCCGTTTC GGTATATTCC 1140

CTAGATAAAG TAGGACGTGC TGCCCGAACA CGTAGTTGCG ACCGGCAAAG CCATATAAGG

GCGGCGGGGC TATTTCCGCA GGCTCTACCA AGCTCATCGC CTGCACCACG CGGTCGAGGG 1200

CGCCGCCCCG ATAAAGGCGT CCGAGATGGT TCGAGTAGCG GACGTGGTGC GCCAGCTCCC






GCGGGACCAC TGCGTCAGCT TCGGCTTCAT CTATGCCCCA CCCGTGGACA AGCTGAAGCA 1260
CGCCCTGGTG ACGCAGTCGA AGCCGAAGTA GATACGGGGT GGGCACCTGT TCGACTTCGT

GGATCTGAAG CGGTCGGGTG TCCTGCGCCC CCAGGACGAG CGTCCGTCGT GATCTCTGAT 1320
CCTAGACTTC GCCAGCCCAC AGGACGCGGG GGTCCTGCTC GCAGGCAGCA CTAGAGACTA

CCCGGCGTGG CCGCATGAAA TCCGACGTGC TGCTGGCAGG GGCCGGCCTT GCCAACGGAC 1380
GGGCCGCACC GGCGTACTTT AGGCTGCACG ACGACCGTCC CCGGCCGGAA CGGTTGCCTG

TGATCGCGCT GGCGATCCGC AAGGCGCGGC CCGACCTTCG CGTGCTGCTG CTGGACCGTG 1440
ACTAGCGCGA CCGCTAGGCG TTCCGCGCCG GGCTGGAAGC GCACGACGAC GACCTGGCAC

CGGCGGGCGC CTCGGACGGG CATACTTGGT CCTGCCACGA CACCGATTTG GCGCCGCACT 1500
GCCGCCCGCG GAGCCTGCCC GTATGAACCA GGACGGTGCT GTGGCTAAAC CGCGGCGTGA

GGCTGGACCG CCTGAAGCCG ATCAGGCGTG GCGACTGGCC CGATCAGGAG GTGCGGTTCC 1560
CCGACCTGGC GGACTTCGGC TAGTCCGCAC CGCTGACCGG GCTAGTCCTC CACGCCAAGG

CAGACCATTC GCGAAGGCTC CGGGCCGGAT ATGGCTCGAT CGACGGGCGG GGGCTGATGC 1620
GTCTGGTAAG CGCTTCCGAG GCCCGGCCTA TACCGAGCTA GCTGCCCGCC CCCGACTACG

GTGCGGTGAC C 1631
CACGCCACTG G
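
Because SEQ ID NO: 7 is printed as two aligned strands, each base of the lower strand should be the Watson-Crick complement of the base directly above it, which gives a simple way to spot transcription errors. The sketch below is an illustration only and not part of the original disclosure. The function name `strand_mismatches` is hypothetical, and the two example rows are the first 60 base pairs as transcribed here.

```python
# Translation table for Watson-Crick complements (A<->T, C<->G).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def strand_mismatches(upper, lower):
    """Return 1-based positions where the lower strand is not the complement of the upper.

    Both strands are given left to right as printed (lower strand aligned under the
    upper strand, not reversed); spaces between blocks of ten are ignored.
    """
    upper = upper.replace(" ", "")
    lower = lower.replace(" ", "")
    if len(upper) != len(lower):
        raise ValueError("strands differ in length")
    expected = upper.translate(COMPLEMENT)
    return [i + 1 for i, (e, b) in enumerate(zip(expected, lower)) if e != b]


# First 60 base pairs of SEQ ID NO: 7, as transcribed from the listing.
UPPER = "CTGCAGGCCG GGCCCGGTGG CCAATGGTCG CAACCGGCAG GACTGGAACA GGACGGCGGG"
LOWER = "GACGTCCGGC CCGGGCCACC GGTTACCAGC GTTGGCCGTC CTGACCTTGT CCTGCCGCCC"

print(strand_mismatches(UPPER, LOWER))
```

An empty list means the two rows are mutually consistent; a non-empty list points at positions where one of the two strands has probably been misread.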
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Claims 

1. A DNA strand having a nucleotide sequence which encodes a polypeptide having an enzyme activity for converting
a methylene group at the 4-position of a β-ionone ring into a keto group.

2. A DNA strand having a nucleotide sequence which encodes a polypeptide having an enzyme activity for converting
the methylene group at the 4-position of the β-ionone ring into a keto group and having an amino acid sequence
substantially of amino acid Nos. 1 - 212 which is shown in the SEQ ID NO: 1.

3. A DNA strand hybridizing the DNA strand according to claim 2 and having a nucleotide sequence which encodes
a polypeptide having an enzyme activity according to claim 2.

4. A DNA strand having a nucleotide sequence which encodes a polypeptide having an enzyme activity for converting
the methylene group at the 4-position of the β-ionone ring into a keto group and having an amino acid sequence
substantially of amino acid Nos. 1 - 242 which is shown in the SEQ ID NO: 5.

5. A DNA strand hybridizing the DNA strand according to claim 4 and having a nucleotide sequence which encodes
a polypeptide having an enzyme activity according to claim 4.

6. A DNA strand having a nucleotide sequence which encodes a polypeptide having an enzyme activity for converting
β-carotene into canthaxanthin by way of echinenone and having an amino acid sequence substantially of amino
acid Nos. 1 - 212 which is shown in the SEQ ID NO: 1.

7. A DNA strand hybridizing the DNA strand according to claim 6 and having a nucleotide sequence which encodes
a polypeptide having an enzyme activity according to claim 6.

8. A DNA strand having a nucleotide sequence which encodes a polypeptide having an enzyme activity for converting
β-carotene into canthaxanthin by way of echinenone and having an amino acid sequence substantially of amino
acid Nos. 1 - 242 which is shown in the SEQ ID NO: 5.

9. A DNA strand hybridizing the DNA strand according to claim 8 and having a nucleotide sequence which encodes
a polypeptide having an enzyme activity according to claim 8.

10. A DNA strand having a nucleotide sequence which encodes a polypeptide having an enzyme activity for converting
the methylene group at the 4-position of the 3-hydroxy-β-ionone ring into a keto group.

11. A DNA strand having a nucleotide sequence which encodes a polypeptide having an enzyme activity for converting
the methylene group at the 4-position of the 3-hydroxy-β-ionone ring into a keto group and having an amino acid
sequence substantially of amino acid Nos. 1 - 212 which is shown in the SEQ ID NO: 1.

12. A DNA strand hybridizing the DNA strand according to claim 11 and having a nucleotide sequence which encodes
a polypeptide having an enzyme activity according to claim 11.

13. A DNA strand having a nucleotide sequence which encodes a polypeptide having an enzyme activity for converting
the methylene group at the 4-position of the 3-hydroxy-β-ionone ring into a keto group and having an amino acid
sequence substantially of amino acid Nos. 1 - 242 which is shown in the SEQ ID NO: 5.

14. A DNA strand hybridizing the DNA strand according to claim 13 and having a nucleotide sequence which encodes
a polypeptide having an enzyme activity according to claim 13.

15. A DNA strand having a nucleotide sequence which encodes a polypeptide having an enzyme activity for converting
zeaxanthin into astaxanthin by way of 4-ketozeaxanthin and having an amino acid sequence substantially of amino
acid Nos. 1 - 212 which is shown in the SEQ ID NO: 1.

16. A DNA strand hybridizing the DNA strand according to claim 15 and having a nucleotide sequence which encodes
a polypeptide having an enzyme activity according to claim 15.
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cleotide sequence which encodes a polypeptide having an enzyi 



17. A DNA strand having a nucleotide sequence which encodes a polypeptide having an enzyme activity for converting
zeaxanthin into astaxanthin by way of 4-ketozeaxanthin and having an amino acid sequence substantially of amino
acid Nos. 1 - 242 which is shown in the SEQ ID NO: 5.

18. A DNA strand hybridizing the DNA strand according to claim 17 and having a nucleotide sequence which encodes
a polypeptide having an enzyme activity according to claim 17.

19. A DNA strand having a nucleotide sequence which encodes a polypeptide having an enzyme activity for adding a
hydroxyl group to the 3-carbon of the 4-keto-β-ionone ring.

20. A DNA strand having a nucleotide sequence which encodes a polypeptide having an enzyme activity for adding a
hydroxyl group to position 3-carbon of the 4-keto-β-ionone ring and having an amino acid sequence substantially
of amino acid Nos. 1 - 162 which is shown in the SEQ ID NO: 2.

21. A DNA strand hybridizing the DNA strand according to claim 20 and having a nucleotide sequence which encodes
a polypeptide having an enzyme activity according to claim 20.

22. A DNA strand having a nucleotide sequence which encodes a polypeptide having an enzyme activity for adding a
hydroxyl group to position 3-carbon of the 4-keto-β-ionone ring and having an amino acid sequence substantially
of amino acid Nos. 1 - 162 which is shown in the SEQ ID NO: 6.

23. A DNA strand hybridizing the DNA strand according to claim 22 and having a nucleotide sequence which encodes
a polypeptide having an enzyme activity according to claim 22.

24. A DNA strand having a nucleotide sequence which encodes a polypeptide having an enzyme activity for converting
canthaxanthin into astaxanthin by way of phoenicoxanthin and having an amino acid sequence substantially of
amino acid Nos. 1 - 162 which is shown in the SEQ ID NO: 2.

25. A DNA strand hybridizing the DNA strand according to claim 24 and having a nucleotide sequence which encodes
a polypeptide having an enzyme activity according to claim 24.

26. A DNA strand having a nucleotide sequence which encodes a polypeptide having an enzyme activity for converting
canthaxanthin into astaxanthin by way of phoenicoxanthin and having an amino acid sequence substantially of
amino acid Nos. 1 - 162 which is shown in the SEQ ID NO: 6.

27. A DNA strand hybridizing the DNA strand according to claim 26 and having a nucleotide sequence which encodes
a polypeptide having an enzyme activity according to claim 26.

28. A process for producing a xanthophyll comprising introducing the DNA strand according to any one of claims 1 - 9
into a microorganism having a β-carotene-synthesizing ability, culturing the transformed microorganism in a culture
medium, and obtaining canthaxanthin or echinenone from the cultured cells.

29. A process for producing a xanthophyll comprising introducing the DNA strand according to any one of claims 10 -
18 into a microorganism having a zeaxanthin-synthesizing ability, culturing the transformed microorganism in a
culture medium, and obtaining astaxanthin or 4-ketozeaxanthin from the cultured cells.

30. A process for producing a xanthophyll comprising introducing the DNA strand according to any one of claims 19 -
27 into a microorganism having a canthaxanthin-synthesizing ability, culturing the transformed microorganism in a
culture medium, and obtaining astaxanthin or phoenicoxanthin from the cultured cells.

31. A process for producing a xanthophyll according to any one of claims 28 - 30, wherein the microorganism is a
bacterium or yeast.
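
The substrate-to-product relationships recited in claims 28 to 30 can be summarised as two small lookup tables, one per enzyme activity. The sketch below is an illustrative aid only and not part of the claims. The names `KETOLASE`, `HYDROXYLASE` and `expected_products` are hypothetical, and the tables contain only the conversions named in the claims above.

```python
# 4-position methylene -> keto group (activity of claims 1 - 18).
KETOLASE = {
    "beta-carotene": "echinenone",
    "echinenone": "canthaxanthin",
    "zeaxanthin": "4-ketozeaxanthin",
    "4-ketozeaxanthin": "astaxanthin",
}

# Hydroxyl group added at the 3-carbon of a 4-keto ring (activity of claims 19 - 27).
HYDROXYLASE = {
    "canthaxanthin": "phoenicoxanthin",
    "phoenicoxanthin": "astaxanthin",
}


def expected_products(host_pigment, steps):
    """List the pigments reachable from the host's own pigment via one enzyme activity."""
    products = []
    current = host_pigment
    while current in steps:
        current = steps[current]
        products.append(current)
    return products


print(expected_products("beta-carotene", KETOLASE))
print(expected_products("zeaxanthin", KETOLASE))
print(expected_products("canthaxanthin", HYDROXYLASE))
```

Each call lists the intermediate first and the end product second, which matches the "by way of" wording used in the claims.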
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A' ■■ 
X 

237 246 255 264 273 282

GTG CAT GCG CTG TGG TTT CTG GAC GCA GCG GCG CAT CCC ATC CTG GCG ATC GCA
Met His Ala Leu Trp Phe Leu Asp Ala Ala Ala His Pro Ile Leu Ala Ile Ala

291 300 309 318 327 336

AAT TTC CTG GGG CTG ACC TGG CTG TCG GTC GGA TTG TTC ATC ATC GCG CAT GAC
Asn Phe Leu Gly Leu Thr Trp Leu Ser Val Gly Leu Phe Ile Ile Ala His Asp

345 354 363 372 381 390

GCG ATG CAC GGG TCG GTG GTG CCG GGG CGT CCG CGC GCC AAT GCG GCG ATG GGG
Ala Met His Gly Ser Val Val Pro Gly Arg Pro Arg Ala Asn Ala Ala Met Gly

399 408 417 426 435 444

CAG CTT GTC CTG TGG CTG TAT GCC GGA TTT TCG TGG CGC AAG ATG ATC GTC AAG
Gln Leu Val Leu Trp Leu Tyr Ala Gly Phe Ser Trp Arg Lys Met Ile Val Lys

453 462 471 480 489 498

CAC ATG GCC CAT CAC CGC CAT GCC GGA ACC GAC GAC GAC CCC GAT TTC GAC CAT
His Met Ala His His Arg His Ala Gly Thr Asp Asp Asp Pro Asp Phe Asp His

507 516 525 534 543 552

GGC GGC CCG GTC CGC TGG TAC GCC CGC TTC ATC GGC ACC TAT TTC GGC TGG CGC
Gly Gly Pro Val Arg Trp Tyr Ala Arg Phe Ile Gly Thr Tyr Phe Gly Trp Arg

561 570 579 588 597 606

GAG GGG CTG CTG CTG CCC GTC ATC GTG ACG GTC TAT GCG CTG ATC CTT GGG GAT
Glu Gly Leu Leu Leu Pro Val Ile Val Thr Val Tyr Ala Leu Ile Leu Gly Asp

615 624 633 642 651 660

CGC TGG ATG TAC GTG GTC TTC TGG CCG CTG CCG TCG ATC CTG GCG TCG ATC CAG
Arg Trp Met Tyr Val Val Phe Trp Pro Leu Pro Ser Ile Leu Ala Ser Ile Gln

669 678 687 696 705 714

CTG TTC GTG TTC GGC ACC TGG CTG CCG CAC CGC CCC GGC CAC GAC GCG TTC CCG
Leu Phe Val Phe Gly Thr Trp Leu Pro His Arg Pro Gly His Asp Ala Phe Pro

723 732 741 750 759 768

GAC CGC CAC AAT GCG CGG TCG TCG CGG ATC AGC GAC CCC GTG TCG CTG CTG ACC
Asp Arg His Asn Ala Arg Ser Ser Arg Ile Ser Asp Pro Val Ser Leu Leu Thr

777 786 795 804 813 822

TGC TTT CAC TTT GGC GGT TAT CAT CAC GAA CAC CAC CTG CAC CCG ACG GTG CCG
Cys Phe His Phe Gly Gly Tyr His His Glu His His Leu His Pro Thr Val Pro

831 840 849 858 867

TGG TGG CGC CTG CCC AGC ACC CGC ACC AAG GGG GAC ACC GCA TGA
Trp Trp Arg Leu Pro Ser Thr Arg Thr Lys Gly Asp Thr Ala ***

T 



FIG. 1
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C 

872 881 890 899 908 917

ATG ACC AAT TTC CTG ATC GTC GTC GCC ACC GTG CTG GTG ATG GAG TTG ACG GCC
Met Thr Asn Phe Leu Ile Val Val Ala Thr Val Leu Val Met Glu Leu Thr Ala

926 935 944 953 962 971

TAT TCC GTC CAC CGC TGG ATC ATG CAC GGC CCC CTG GGC TGG GGC TGG CAC AAG
Tyr Ser Val His Arg Trp Ile Met His Gly Pro Leu Gly Trp Gly Trp His Lys

980 989 998 1007 1016 1025

TCC CAC CAC GAG GAA CAC GAC CAC GCG CTG GAA AAG AAC GAC CTG TAC GGC CTG
Ser His His Glu Glu His Asp His Ala Leu Glu Lys Asn Asp Leu Tyr Gly Leu

1034 1043 1052 1061 1070 1079

GTC TTT GCG GTG ATC GCC ACG GTG CTG TTC ACG GTG GGC TGG ATC TGG GCG CCG
Val Phe Ala Val Ile Ala Thr Val Leu Phe Thr Val Gly Trp Ile Trp Ala Pro

1088 1097 1106 1115 1124 1133

GTC CTG TGG TGG ATC GCC TTG GGC ATG ACT GTC TAT GGG CTG ATC TAT TTC GTC
Val Leu Trp Trp Ile Ala Leu Gly Met Thr Val Tyr Gly Leu Ile Tyr Phe Val

1142 1151 1160 1169 1178 1187

CTG CAT GAC GGG CTG GTG CAT CAG CGC TGG CCG TTC CGT TAT ATC CCG CGC AAG
Leu His Asp Gly Leu Val His Gln Arg Trp Pro Phe Arg Tyr Ile Pro Arg Lys

1196 1205 1214 1223 1232 1241

GGC TAT GCC AGA CGC CTG TAT CAG GCC CAC CGC CTG CAC CAT GCG GTC GAG GGG
Gly Tyr Ala Arg Arg Leu Tyr Gln Ala His Arg Leu His His Ala Val Glu Gly

1250 1259 1268 1277 1286 1295

CGC GAC CAT TGC GTC AGC TTC GGC TTC ATC TAT GCG CCC CCG GTC GAC AAG CTG
Arg Asp His Cys Val Ser Phe Gly Phe Ile Tyr Ala Pro Pro Val Asp Lys Leu

1304 1313 1322 1331 1340 1349

AAG CAG GAC CTG AAG ATG TCG GGC GTG CTG CGG GCC GAG GCG CAG GAG CGC ACG
Lys Gln Asp Leu Lys Met Ser Gly Val Leu Arg Ala Glu Ala Gln Glu Arg Thr

TGA

::: d 
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t 

E ' 

1357 1366 1375 1384 1393 1402

GTG ACC CAT GAC GTG CTG CTG GCA GGG GCG GGC CTT GCC AAC GGG CTG ATC GCC
Met Thr His Asp Val Leu Leu Ala Gly Ala Gly Leu Ala Asn Gly Leu Ile Ala

1411 1420 1429 1438 1447 1456

CTG GCG CTG CGC GCG GCG CGG CCC GAC CTG CGC GTG CTG CTG CTG GAC CAT GCC
Leu Ala Leu Arg Ala Ala Arg Pro Asp Leu Arg Val Leu Leu Leu Asp His Ala

1465 1474 1483 1492 1501 1510

GCA GGA CCG TCA GAC GGC CAC ACC TGG TCC TGC CAC GAC CCC GAC CTG TCG CCG
Ala Gly Pro Ser Asp Gly His Thr Trp Ser Cys His Asp Pro Asp Leu Ser Pro

1519 1528 1537 1546 1555 1564

GAC TGG CTG GCG CGG CTG AAG CCC CTG CGC CGC GCC AAC TGG CCC GAC CAG GAG
Asp Trp Leu Ala Arg Leu Lys Pro Leu Arg Arg Ala Asn Trp Pro Asp Gln Glu

1573 1582 1591 1600 1609 1618

GTG CGC TTT CCC CGC CAT GCC CGG CGG CTG GCC ACC GGT TAC GGG TCG CTG GAC
Val Arg Phe Pro Arg His Ala Arg Arg Leu Ala Thr Gly Tyr Gly Ser Leu Asp

1627 1636 1645 1654 1663 1672

GGG GCG GCG CTG GCG GAT GCG GTG GTC CGG TCG GGC GCC GAG ATC CGC TGG GAC
Gly Ala Ala Leu Ala Asp Ala Val Val Arg Ser Gly Ala Glu Ile Arg Trp Asp

1681 1690 1699 1708 1717 1726

AGC GAC ATC GCC CTG CTG GAT GCG CAG GGG GCG ACG CTG TCC TGC GGC ACC CGG
Ser Asp Ile Ala Leu Leu Asp Ala Gln Gly Ala Thr Leu Ser Cys Gly Thr Arg

1735 1744 1753 1762 1771 1780

ATC GAG GCG GGC GCG GTC CTG GAC GGG CGG GGC GCG CAG CCG TCG CGG CAT CTG
Ile Glu Ala Gly Ala Val Leu Asp Gly Arg Gly Ala Gln Pro Ser Arg His Leu

1789 1798 1807 1816 1825 1834

ACC GTG GGT TTC CAG AAA TTC GTG GGT GTC GAG ATC GAG ACC GAC CGC CCC CAC
Thr Val Gly Phe Gln Lys Phe Val Gly Val Glu Ile Glu Thr Asp Arg Pro His

1843 1852 1861 1870 1879 1888

GGC GTG CCC CGC CCG ATG ATC ATG GAC GCG ACC GTC ACC CAG CAG GAC GGG TAC
Gly Val Pro Arg Pro Met Ile Met Asp Ala Thr Val Thr Gln Gln Asp Gly Tyr

1897 1906 1915 1924 1933 1942

CGC TTC ATC TAT CTG CTG CCC TTC TCT CCG ACG CGC ATC CTG ATC GAG GAC ACG
Arg Phe Ile Tyr Leu Leu Pro Phe Ser Pro Thr Arg Ile Leu Ile Glu Asp Thr

1951 1960 1969 1978 1987 1996

CGC TAT TCC GAT GGC GGC GAT CTG GAC GAC GAC GCG CTG GCG GCG GCG TCC CAC
Arg Tyr Ser Asp Gly Gly Asp Leu Asp Asp Asp Ala Leu Ala Ala Ala Ser His
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2005 2014 2023 

GAC TAT GCC CGC CAG CAG GGC TGG ACC
Asp Tyr Ala Arg Gln Gln Gly Trp Thr

2032 2041 2050

GGG GCC GAG GTC CGG CGC GAA CGC GGC
Gly Ala Glu Val Arg Arg Glu Arg Gly

2059 2068 2077

ATC CTT CCC ATC GCG CTG GCC CAT GAT
Ile Leu Pro Ile Ala Leu Ala His Asp

2086 2095 2104

GCG GCG GGC TTC TGG GCC GAT CAC GCG
Ala Ala Gly Phe Trp Ala Asp His Ala

2113 2122 2131

GCG GGG CCT GTT CCC GTG GGA CTG CGC
Ala Gly Pro Val Pro Val Gly Leu Arg

2140 2149 2158

GCG GGG TTC TTT CAT CCG GTC ACC GGC
Ala Gly Phe Phe His Pro Val Thr Gly

2167 2176 2185

TAT TCG CTG CCC TAT GCG GCA CAG GTG
Tyr Ser Leu Pro Tyr Ala Ala Gln Val

2194 2203 2212

GCG GAC GTG GTG GCG GGT CTG TCC GGG
Ala Asp Val Val Ala Gly Leu Ser Gly

2221 2230 2239

CCG CCC GGC ACC GAC GCG CTG CGC GGC
Pro Pro Gly Thr Asp Ala Leu Arg Gly

2248 2257 2266

GCC ATC CGC GAT TAC GCG ATC GAC CGG
Ala Ile Arg Asp Tyr Ala Ile Asp Arg

2275 2284 2293

GCG CGC CGC GAC CGC TTT CTG CGC CTT
Ala Arg Arg Asp Arg Phe Leu Arg Leu

2302 2311 2320

TTG AAC CGG ATG CTG TTC CGC GGC TGC
Leu Asn Arg Met Leu Phe Arg Gly Cys

2329 2338 2347

GCG CCC GAC CGG CGC TAT ACC CTG CTG
Ala Pro Asp Arg Arg Tyr Thr Leu Leu

2356 2365 2374

CAG CGG TTC TAC CGC ATG CCG CAT GGA
Gln Arg Phe Tyr Arg Met Pro His Gly

2383 2392 2401

CTG ATC GAA CGG TTC TAT GCC GGC CGG
Leu Ile Glu Arg Phe Tyr Ala Gly Arg

2410 2419 2428

CTG AGC GTG GCG GAT CAG CTG CGC ATC
Leu Ser Val Ala Asp Gln Leu Arg Ile

2437 2446 2455

GTG ACC GGC AAG CCT CCC ATT CCC CTT
Val Thr Gly Lys Pro Pro Ile Pro Leu

2464 2473 2482

GGC ACG GCC ATC CGC TGC CTG CCC GAA
Gly Thr Ala Ile Arg Cys Leu Pro Glu

2491 2500 2509

CGT CCC CTG CTG AAG GAA AAC GCA TGA
Arg Pro Leu Leu Lys Glu Asn Ala ***

F
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10 20 30 40 50 60 

GGATC CGGCG ACCTT GCGGC GCTGC GCCGC GCGCC TTTGC TGGTG CCTGG GCCGG GTGGC
CCTAG GCCGC TGGAA CGCCG CGACG CGGCG CGCGG AAACG ACCAC GGACC CGGCC CACCG

70 80 90 100 110 120

CAATG GTCGC AAGCA ACGGG GATGG AAACC GGCGA TGCGG GACTG TAGTC TGCGC GGATC
GTTAC CAGCG TTCGT TGCCC CTACC TTTGG CCGCT ACGCC CTGAC ATCAG ACGCG CCTAG

130 140 150 160 170 180

GCCGG TCCGG GGGAC AAGAT GAGCG CACAT GCCCT GCCCA AGGCA GATCT GACCG CCACC
CGGCC AGGGC CCCTG TTCTA CTCGC GTGTA CGGGA CGGGT TCCGT CTAGA CTGGC GGTGG

190 200 210 220 A 230 240

AGCCT GATCG TCTCG GGCGG CATCA TCGCC GCTTG GCTGG CCCTG CATGT GCATG CGCTG
TCGGA CTAGC AGAGC CCGCC GTAGT AGCGG CGAAC CGAGC GGGAC GTACA CGTAC GCGAC

250 260 270 280 290 300

TGGTT TCTGG ACGCA GCGGC GCATC CCATC CTGGC GATCG CAAAT TTCCT GGGGC TGACC
ACCAA AGACC TGCGT CGCCG CGTAG GGTAG GACCG CTAGC GTTTA AAGGA CCCCG ACTGG

310 320 330 340 350 360

TGGCT GTCGG TCGGA TTGTT CATCA TCGCG CATGA CGCGA TGCAC GGGTC GGTGG TGCCG
ACCGA CAGCC AGCCT AACAA GTAGT AGCGC GTACT GCGCT ACGTG CCCAG CCACC ACGGC

370 380 390 400 410 420

GGGCG TCCGC GCGCC AATGC GGCGA TGGGC CAGCT TGTCC TGTGG CTGTA TGCCG GATTT
CCCGC AGGCG CGCGG TTACG CCGCT ACCCG GTCGA ACAGG ACACC GACAT ACGGC CTAAA

430 440 450 460 470 480

TCGTG GCGCA AGATG ATCGT CAAGC ACATG GCCCA TCACC GCCAT GCCGG AACCG ACGAC
AGCAC CGCGT TCTAC TAGCA GTTCG TGTAC CGGGT AGTGG CGGTA CGGCC TTGGC TGCTG

490 500 510 520 530 540

GACCC CGATT TCGAC CATGG CGGCC CGGTC CGCTG GTACG CCCGC TTCAT CGGCA CCTAT
CTGGG GCTAA AGCTG GTACC GCCGG GCCAG GCGAC CATGC GGGCG AAGTA GCCGT GGATA

550 560 570 580 590 600

TTCGG CTGGC GCGAG GGGCT GCTGC TGCCC GTCAT CGTGA CGGTC TATGC GCTGA TCCTT
AAGCC GACCG CGCTC CCCGA CGACG ACGGG CAGTA GCACT GCCAG ATACG CGACT AGGAA

610 620 630 640 650 660

GGGGA TCGCT GGATG TACGT GGTCT TCTGG CCGCT GCCGT CGATC CTGGC GTCGA TCCAG
CCCCT AGCGA CCTAC ATGCA CCAGA AGACC GGCGA CGGCA GCTAG GACCG CAGCT AGGTC
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^■'O 680 6S0 700 710 , 720 

CTGTT CGTGT TCGGC ACCTG GCTGC CGCAC CGCCC CGGCC ACGAC GCGTT CCCGG ACCGC
GACAA GCACA AGCCG TGGAC CGACG GCGTG GCGGG GCCGG TGCTG CGCAA GGGCC TGGCG

730 740 750 760 770 780

CACAA TGCGC GGTCG TCGCG GATCA GCGAC CCCGT GTCGC TGCTG ACCTG CTTTC ACTTT
GTGTT ACGCG CCAGC AGCGC CTAGT CGCTG GGGCA CAGCG ACGAC TGGAC GAAAG TGAAA

790 800 810 820 830 840

GGCGG TTATC ATCAC GAACA CCACC TGCAC CCGAC GGTGC CGTGG TGGCG CCTGC CCAGC
CCGCC AATAG TAGTG CTTGT GGTGG ACGTG GGCTG CCACG GCACC ACCGC GGACG GGTCG

850 860 G. 870 880 890 900

ACCCG CACCA AGGGG GACAC CGCAT GACCA ATTTC CTGAT CGTCG TCGCC ACCGT GCTGG
TGGGC GTGGT TCCCC CTGTG GCGTA CTGGT TAAAG GACTA GCAGC AGCGG TGGCA CGACC

920 g 930 940 950 960

TGATG GAGTT GACGG CCTAT TCCGT CCACC GCTGG ATCAT GCACG GCCCC CTGGG CTGGG
ACTAC CTCAA CTGCC GGATA AGGCA GGTGG CGACC TAGTA CGTGC CGGGG GACCC GACCC

970 980 990 1000 1010 1020

GCTGG CACAA GTCCC ACCAC GAGGA ACACG ACCAC GCGCT GGAAA AGAAC GACCT GTACG
CGACC GTGTT CAGGG TGGTG CTCCT TGTGC TGGTG CGCGA CCTTT TCTTG CTGGA CATGC

1030 1040 1050 1060 1070 1080

GCCTG GTCTT TGCGG TGATC GCCAC GGTGC TGTTC ACGGT GGGCT GGATC TGGGC GCCGG
CGGAC CAGAA ACGCC ACTAG CGGTG CCACG ACAAG TGCCA CCCGA CCTAG ACCCG CGGCC

1090 1100 1110 1120 1130 1140

TCCTG TGGTG GATCG CCTTG GGCAT GACTG TCTAT GGGCT GATCT ATTTC GTCCT GCATG
AGGAC ACCAC CTAGC GGAAC CCGTA CTGAC AGATA CCCGA CTAGA TAAAG CAGGA CGTAC

1150 1160 1170 1180 1190 1200

ACGGG CTGGT GCATC AGCGC TGGCC GTTCC GTTAT ATCCC GCGCA AGGGC TATGC CAGAC
TGCCC GACCA CGTAG TCGCG ACCGG CAAGG CAATA TAGGG CGCGT TCCCG ATACG GTCTG

1210 1220 1230 1240 1250 1260

GCCTG TATCA GGCCC ACCGC CTGCA CCATG CGGTC GAGGG GCGCG ACCAT TGCGT CAGCT
CGGAC ATAGT CCGGG TGGCG GACGT GGTAC GCCAG CTCCC CGCGC TGGTA ACGCA GTCGA

1270 1280 1290 1300 1310 1320

TCGGC TTCAT CTATG CGCCC CCGGT CGACA AGCTG AAGCA GGACC TGAAG ATGTC GGGCG
AGCCG AAGTA GATAC GCGGG GGCCA GCTGT TCGAC TTCGT CCTGG ACTTC TACAG CCCGC
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1330 13^0 Fi350 1360 1370.; . 1360 

TGCTG CGGGC CGAGG CGCAG GAGCG CACGT GACCC ATGAC GTGCT GCTGG CAGGG GCGGG
ACGAC GCCCG GCTCC GCGTC CTCGC GTGCA CTGGG TACTG CACGA CGACC GTCCC CGCCC

1390 1400 1410 1420 1430 1440

CCTTG CCAAC GGGCT GATCG CCCTG GCGCT GCGCG CGGCG CGGCC CGACC TGCGC GTGCT
GGAAC GGTTG CCCGA CTAGC GGGAC CGCGA CGCGC GCCGC GCCGG GCTGG ACGCG CACGA

1450 1460 1470 1480 1490 1500

GCTGC TGGAC CATGC CGCAG GACCG TCAGA CGGCC ACACC TGGTC CTGCC ACGAC CCCGA
CGACG ACCTG GTACG GCGTC CTGGC AGTCT GCCGG TGTGG ACCAG GACGG TGCTG GGGCT

1510 1520 1530 1540 1550 1560

CCTGT CGCCG GACTG GCTGG CGCGG CTGAA GCCCC TGCGC CGCGC CAACT GGCCC GACCA
GGACA GCGGC CTGAC CGACC GCGCC GACTT CGGGG ACGCG GCGCG GTTGA CCGGG CTGGT

1570 1580 1590 1600 1610 1620

GGAGG TGCGC TTTCC CCGCC ATGCC CGGCG GCTGG CCACC GGTTA CGGGT CGCTG GACGG
CCTCC ACGCG AAAGG GGCGG TACGG GCCGC CGACC GGTGG CCAAT GCCCA GCGAC CTGCC

1630 1640 1650 1660 1670 1680

GGCGG CGCTG GCGGA TGCGG TGGTC CGGTC GGGCG CCGAG ATCCG CTGGG ACAGC GACAT
CCGCC GCGAC CGCCT ACGCC ACCAG GCCAG CCCGC GGCTC TAGGC GACCC TGTCG CTGTA

1690 1700 1710 1720 1730 1740

CGCCC TGCTG GATGC GCAGG GGGCG ACGCT GTCCT GCGGC ACCCG GATCG AGGCG GGCGC
GCGGG ACGAC CTACG CGTCC CCCGC TGCGA CAGGA CGCCG TGGGC CTAGC TCCGC CCGCG

1750 1760 1770 1780 1790 1800

GGTCC TGGAC GGGCG GGGCG CGCAG CCGTC GCGGC ATCTG ACCGT GGGTT TCCAG AAATT
CCAGG ACCTG CCCGC CCCGC GCGTC GGCAG CGCCG TAGAC TGGCA CCCAA AGGTC TTTAA

1810 1820 1830 1840 1850 1860

CGTGG GTGTC GAGAT CGAGA CCGAC CGCCC CCACG GCGTG CCCCG CCCGA TGATC ATGGA
GCACC CACAG CTCTA GCTCT GGCTG GCGGG GGTGC CGCAC GGGGC GGGCT ACTAG TACCT

1870 1880 1890 1900 1910 1920

CGCGA CCGTC ACCCA GCAGG ACGGG TACCG CTTCA TCTAT CTGCT GCCCT TCTCT CCGAC
GCGCT GGCAG TGGGT CGTCC TGCCC ATGGC GAAGT AGATA GACGA CGGGA AGAGA GGCTG

1930 1940 1950 1960 1970 1980

GCGCA TCCTG ATCGA GGACA CGCGC TATTC CGATG GCGGC GATCT GGACG ACGAC GCGCT
CGCGT AGGAC TAGCT CCTGT GCGCG ATAAG GCTAC CGCCG CTAGA CCTGC TGCTG CGCGA
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2000 2010 * 2020 2030 2040 

GGCGG CGGCG TCCCA CGACT ATGCC CGCCA GCAGG GCTGG ACCGG GGCCG AGGTC CGGCG
CCGCC GCCGC AGGGT GCTGA TACGG GCGGT CGTCC CGACC TGGCC CCGGC TCCAG GCCGC

2050 2060 2070 2080 2090 2100

CGAAC GCGGC ATCCT TCCCA TCGCG CTGGC CCATG ATGCG GCGGG CTTCT GGGCC GATCA
GCTTG CGCCG TAGGA AGGGT AGCGC GACCG GGTAC TACGC CGCCC GAAGA CCCGG CTAGT

2110 2120 2130 2140 2150 2160

CGCGG CGGGG CCTGT TCCCG TGGGA CTGCG CGCGG GGTTC TTTCA TCCGG TCACC GGCTA
GCGCC GCCCC GGACA AGGGC ACCCT GACGC GCGCC CCAAG AAAGT AGGCC AGTGG CCGAT

2170 2180 2190 2200 2210 2220

TTCGC TGCCC TATGC GGCAC AGGTG GCGGA CGTGG TGGCG GGTCT GTCCG GGCCG CCCGG
AAGCG ACGGG ATACG CCGTG TCCAC CGCCT GCACC ACCGC CCAGA CAGGC CCGGC GGGCC

2230 2240 2250 2260 2270 2280

CACCG ACGCG CTGCG CGGCG CCATC CGCGA TTACG CGATC GACCG GGCGC GCCGC GACCG
GTGGC TGCGC GACGC GCCGC GGTAG GCGCT AATGC GCTAG CTGGC CCGCG CGGCG CTGGC

2290 2300 2310 2320 2330 2340

CTTTC TGCGC CTTTT GAACC GGATG CTGTT CCGCG GCTGC GCGCC CGACC GGCGC TATAC
GAAAG ACGCG GAAAA CTTGG CCTAC GACAA GGCGC CGACG CGCGG GCTGG CCGCG ATATG

2350 2360 2370 2380 2390 2400

CCTGC TGCAG CGGTT CTACC GCATG CCGCA TGGAC TGATC GAACG GTTCT ATGCC GGCCG
GGACG ACGTC GCCAA GATGG CGTAC GGCGT ACCTG ACTAG CTTGC CAAGA TACGG CCGGC

2410 2420 2430 2440 2450 2460

GCTGA GCGTG GCGGA TCAGC TGCGC ATCGT GACCG GCAAG CCTCC CATTC CCCTT GGCAC
CGACT CGCAC CGCCT AGTCG ACGCG TAGCA CTGGC CGTTC GGAGG GTAAG GGGAA CCGTG

2470 2480 2490 2500 2510 2520

GGCCA TCCGC TGCCT GCCCG AACGT CCCCT GCTGA AGGAA AACGC ATGAA CGCCC ATTCG
CCGGT AGGCG ACGGA CGGGC TTGCA GGGGA CGACT TCCTT TTGCG TACTT GCGGG TAAGC

2530 2540 2550 2560 2570 2580

CCCGC GGCCA AGACC GCCAT CGTGA TCGGC GCAGG CTTTG GCGGG CTGGC CCTGG CCATC
GGGCG CCGGT TCTGG CGGTA GCACT AGCCG CGTCC GAAAC CGCCC GACCG GGACC GGTAG

2590 2600 2610 2620 2630 2640

CGCCT GCAGT CCGCG GGCAT CGCCA CCACC CTGGT CGAGG CCCGG GACAA GCCCG GCGGG
GCGGA CGTCA GGCGC CCGTA GCGGT GGTGG GACCA GCTCC GGGCC CTGTT CGGGC CGCCC
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2650 2660 2670 2680 26S0 2700 

CGCGC CTATG TCTGG CACGA TCAGG GCCAT CTCTT CGACG CGGGC CCGAC CGTCA TCACC
GCGCG GATAC AGACC GTGCT AGTCC CGGTA GAGAA GCTGC GCCCG GGCTG GCAGT AGTGG

2710 2720 2730 2740 2750 2760

GACCC CGATG CGCTG AAAGA GCTGT GGGCC CTGAC CGGGC AGGAC ATGGC GCGCG ACGTG
CTGGG GCTAC GCGAC TTTCT CGACA CCCGG GACTG GCCCG TCCTG TACCG CGCGC TGCAC

2770 2780 2790 2800 2810 2820

ACGCT GATGC CGGTC TCGCC CTTCT ATCGG CTGAT GTGGC CGGGC GGGAA GGTCT TCGAT
TGCGA CTACG GCCAG AGCGG GAAGA TAGCC GACTA CACCG GCCCG CCCTT CCAGA AGCTA

2830 2840 2850 2860 2870 2880

TACGT GAACG AGGCC GATCC AGGGT CTGGG TCTTG CCGTG CCAGG TGAAG CTGTT GCCGT
ATGCA CTTGC TCCGG CTAGG TCCCA GACCC AGAAC GGCAC GGTCC ACTTC GACAA CGGCA

2886

GGATC C
CCTAG G
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110 120 130 140 150 

ATGTCCGGACGGAAGCCTGGCACAACTGGCGACACGATCGTCAATCTCGGTCTGACCGCC 
1 MetSerGlyArgLysProGlyThrThrGlyAspThrlleValAsnLeuGlyLeuThrAla 

^^0* I'^O 180 190 200 210 

GCGATCCTGCTGTGCTGGCTGGTCCTGCACGCCTTTACGCTATGGTTGCTAGATGCGGCC 
21 AlalleLeuLeuCysTrpLeuValLeuHisAlaPheThrLeuTrpLeuLeuAspAlaAla 

220 230 240 250 260 270 

GCGCATCCGCTGCTTGCCGTGCTGTGCCTGGCTGGGCTGACCTGGCTGTCGGTCGGGCTG 
41 AlaHisProLeuLeuAlaValLeuCysLeuAlaGlyLeuThrTrpLeuSerValGlyLeu 

. 290 300 . 310 320 330 

TTCATCATCGCGCATGACGCAATGCACGGGTCCGTGGTGCCGGGGCGGCCGCGCGCCAAT 
61 PhellelleAlaHisAspAlaMetHisGlySerValValProGlyArgProArgAlaAsn 

^"^^ 350 360 370 380 390 

GCGGCGATCGGGCAACTGGCGCTGTGGCTCTATGCGGGGTTCTCGTGGCCCAAGCTGATC 
81 AlaAlalleGlyGlnLeuAlaLeuTrpLeuTyrAlaGlyPheSerTrpProLysLeuIle 

410 420 . 430 440 450 

GCCAAGCACATGACGCATCACCGGCACGCGGGCACCGACAACGATCCCGATTTCGGTCAC 
101 AlaLysHisMetThrHisHisArgHisAlaGlyThrAspAsnAspProAspPheGlyHis 

470 480 490 500 510 

GGAGGGCCCGTGCGCTGGTACGGCAGCTTCGTCTCCACCTATTTCGGCTGGCGAGAGGGA 
121 GlyGlyProValArgTrpTyrGlySerPheValSerThrTyrPheGlyTrpArgGluGly 

530 540 550 560 570 

CTGCTGCTACCGGTGATCGTGACCACCTATGCGCTGATCCTGGGCGATCGCTGGATGTAT 
14.1 LeuLeuLeuProVallleValThrThrTyrAlaLeuIleLeuGlyAspArgTrpMetTyr 

.590 600 610 620 630 

GTCATCTTCTGGCCGGTCCCGGCCGTTCTGGCGTCGATCCAGATTTTCGTCTTCGGAACT 
161 ValllePheTrpProValProAlaValLeuAlaSerlleGlnllePheValPheGlyThr 

650 660 670 680 690 

TGGCTGCCCCACCGCCCGGGACATGACGATTTTCCCGACCGGCACAACGCGAGGTCGACC 
181 TrpLeuProHisArgProGlyHisAspAspPheProAspArgHisAsnAlaArgSerThr 

'^^^ .• "^10 720 730 740 . 750 

GGCATCGGCGACCCGTTGTCACTACTGACCTGCTTCCATTTCGGCGGCTATCACCACGAA 
201 GlyIleGlyAspProLeuSerLeuLeuThrCysPheHisPheGlyGlyTyrHisHisGlu 
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760 770 780 790 800 810 

CATCACCTGCATCCGCATGTGCCGTGGTGGCGCCTGCCTCGTACACGCAAGACCGGAGGC 
221 HisHisLeuHisProHisValProTrpTrpArgLeuProArgThrArgLysThrGlyGly 

820 827 
CGCGCATGA 
241 ArgAla*** 

Ib 
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c 

830 840 850 860 870 880 

ATGACGCAATTCCTCATTGTCGTGGCGACAGTCCTCGTGATGGAGCTGACCGCCTATTCC 
1 MetThrGlnPheLeuIleValValAlaThrValLeuValMetGluLeuThrAlaTyrSer 



890 900 910 920 930 940 

GTCCACCGCTGGATTATGCACGGCCCCCTAGGCTGGGGCTGGCACAAGTCCCATCACGAA 
21 ValHisArgTrpIleMetHisGlyProLeuGlyTrpGlyTrpHisLysSerHisHisGlu 

950 960 970 980 990 1000 

GAGCACGACCACGCGTTGGAGAAGAACGACCTCTACGGCGTCGTCTTCGCGGTGCTGGCG 
41 GluHisAspHisAlaLeuGluLysAsnAspLeuTyrGlyValValPheAlaValLeuAla 

1010 1020 1030 1040 1050 1060 

ACGATCCTCTTCACCGTGGGCGCCTATTGGTGGCCGGTGCTGTGGTGGATCGCCCTGGGC 
61 ThrIleLeuPheThrValGlyAlaTyrTrpTrpProValLeuTrpTrpIleAlaLeuGly 

1070 1080 1090 1100 1110 1120 

ATGACGGTCTATGGGTTGATCTATTTCATCCTGCACGACGGGCTTGTGCATCAACGCTGG 
81 MetThrValTyrGlyLeuIleTyrPheIleLeuHisAspGlyLeuValHisGlnArgTrp 

1130 1140 1150 1160 1170 1180 

CCGTTTCGGTATATTCCGCGGCGGGGCTATTTCCGCAGGCTCTACCAAGCTCATCGCCTG 
101 ProPheArgTyrIleProArgArgGlyTyrPheArgArgLeuTyrGlnAlaHisArgLeu 

1190 1200 1210 1220 1230 1240 

CACCACGCGGTCGAGGGGCGGGACCACTGCGTCAGCTTCGGCTTCATCTATGCCCCACCC 
121 HisHisAlaValGluGlyArgAspHisCysValSerPheGlyPheIleTyrAlaProPro 

1250 1260 1270 1280 1290 1300 

GTGGACAAGCTGAAGCAGGATCTGAAGCGGTCGGGTGTCCTGCGCCCCCAGGACGAGCGT 
141 ValAspLysLeuLysGlnAspLeuLysArgSerGlyValLeuArgProGlnAspGluArg 

1312 
CCGTCGTGA 
161 ProSer*** 
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10 20 30 40 50 60 

CTGCA GGCCG GGCCC GGTGG CCAAT GGTCG CAACC GGCAG GACTG GAACA GGACG GCGGG 
GACGT CCGGC CCGGG CCACC GGTTA CCAGC GTTGG CCGTC CTGAC CTTGT CCTGC CGCCC 



70 80 90 100 110 120 

CCGGT CTAGG CTGTC GCCCT ACGCA GCAGG AGTTT CGGAT GTCCG GACGG AAGCC TGGCA 
GGCCA GATCC GACAG CGGGA TGCGT CGTCC TCAAA GCCTA CAGGC CTGCC TTCGG ACCGT 



130 140 150 160 170 180 

CAACT GGCGA CACGA TCGTC AATCT CGGTC TGACC GCCGC GATCC TGCTG TGCTG GCTGG 
GTTGA CCGCT GTGCT AGCAG TTAGA GCCAG ACTGG CGGCG CTAGG ACGAC ACGAC CGACC 

190 200 210 220 230 240 

TCCTG CACGC CTTTA CGCTA TGGTT GCTAG ATGCG GCCGC GCATC CGCTG CTTGC CGTGC 
AGGAC GTGCG GAAAT GCGAT ACCAA CGATC TACGC CGGCG CGTAG GCGAC GAACG GCACG 

250 260 270 280 290 300 

TGTGC CTGGC TGGGC TGACC TGGCT GTCGG TCGGG CTGTT CATCA TCGCG CATGA CGCAA 
ACACG GACCG ACCCG ACTGG ACCGA CAGCC AGCCC GACAA GTAGT AGCGC GTACT GCGTT 

310 320 330 340 350 360 

TGCAC GGGTC CGTGG TGCCG GGGCG GCCGC GCGCC AATGC GGCGA TCGGG CAACT GGCGC 
ACGTG CCCAG GCACC ACGGC CCCGC CGGCG CGCGG TTACG CCGCT AGCCC GTTGA CCGCG 

370 380 390 400 410 420 

TGTGG CTCTA TGCGG GGTTC TCGTG GCCCA AGCTG ATCGC CAAGC ACATG ACGCA TCACC 
ACACC GAGAT ACGCC CCAAG AGCAC CGGGT TCGAC TAGCG GTTCG TGTAC TGCGT AGTGG 

430 440 450 460 470 480 

GGCAC GCCGG CACCG ACAAC GATCC CGATT TCGGT CACGG AGGGC CCGTG CGCTG GTACG 
CCGTG CGGCC GTGGC TGTTG CTAGG GCTAA AGCCA GTGCC TCCCG GGCAC GCGAC CATGC 

490 500 510 520 530 540 

GCAGC TTCGT CTCCA CCTAT TTCGG CTGGC GAGAG GGACT GCTGC TACCG GTGAT CGTCA 
CGTCG AAGCA GAGGT GGATA AAGCC GACCG CTCTC CCTGA CGACG ATGGC CACTA GCAGT 

550 560 570 580 590 600 

CCACC TATGC GCTGA TCCTG GGCGA TCGCT GGATG TATGT CATCT TCTGG CCGGT CCCGG 
GGTGG ATACG CGACT AGGAC CCGCT AGCGA CCTAC ATACA GTAGA AGACC GGCCA GGGCC 

610 620 630 640 650 660 

CCGTT CTGGC GTCGA TCCAG ATTTT CGTCT TCGGA ACTTG GCTGC CCCAC CGCCC GGGAC 
GGCAA GACCG CAGCT AGGTC TAAAA GCAGA AGCCT TGAAC CGACG GGGTG GCGGG CCCTG 

670 680 690 700 710 720 

ATGAC GATTT TCCCG ACCGG CACAA CGCGA GGTCG ACCGG CATCG GCGAC CCGTT GTCAC 
TACTG CTAAA AGGGC TGGCC GTGTT GCGCT CCAGC TGGCC GTAGC CGCTG GGCAA CAGTG 
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730 740 750 760 770 780 

TACTG ACCTG CTTCC ATTTC GGCGG CTATC ACCAC GAACA TCACC TGCAT CCGCA TGTGC 
ATGAC TGGAC GAAGG TAAAG CCGCC GATAG TGGTG CTTGT AGTGG ACGTA GGCGT ACACG 

CI 

790 800 810 820 830 840 

CGTGG TGGCG CCTGC CTCGT ACACG CAAGA CCGGA GGCCG CGCAT GACGC AATTC CTCAT 
GCACC ACCGC GGACG GAGCA TGTGC GTTCT GGCCT CCGGC GCGTA CTGCG TTAAG GAGTA 

850 860 870 880 890 900 

TGTCG TGGCG ACAGT CCTCG TGATG GAGCT GACCG CCTAT TCCGT CCACC GCTGG ATTAT 
ACAGC ACCGC TGTCA GGAGC ACTAC CTCGA CTGGC GGATA AGGCA GGTGG CGACC TAATA 

910 920 930 940 950 960 

GCACG GCCCC CTAGG CTGGG GCTGG CACAA GTCCC ATCAC GAAGA GCACG ACCAC GCGTT 
CGTGC CGGGG GATCC GACCC CGACC GTGTT CAGGG TAGTG CTTCT CGTGC TGGTG CGCAA 

970 980 990 1000 1010 1020 

GGAGA AGAAC GACCT CTACG GCGTC GTCTT CGCGG TGCTG GCGAC GATCC TCTTC ACCGT 
CCTCT TCTTG CTGGA GATGC CGCAG CAGAA GCGCC ACGAC CGCTG CTAGG AGAAG TGGCA 

1030 1040 1050 1060 1070 1080 

GGGCG CCTAT TGGTG GCCGG TGCTG TGGTG GATCG CCCTG GGCAT GACGG TCTAT GGGTT 
CCCGC GGATA ACCAC CGGCC ACGAC ACCAC CTAGC GGGAC CCGTA CTGCC AGATA CCCAA 

1090 1100 1110 1120 1130 1140 

GATCT ATTTC ATCCT GCACG ACGGG CTTGT GCATC AACGC TGGCC GTTTC GGTAT ATTCC 
CTAGA TAAAG TAGGA CGTGC TGCCC GAACA CGTAG TTGCG ACCGG CAAAG CCATA TAAGG 

1150 1160 1170 1180 1190 1200 

GCGGC GGGGC TATTT CCGCA GGCTC TACCA AGCTC ATCGC CTGCA CCACG CGGTC GAGGG 
CGCCG CCCCG ATAAA GGCGT CCGAG ATGGT TCGAG TAGCG GACGT GGTGC GCCAG CTCCC 

1210 1220 1230 1240 1250 1260 

GCGGG ACCAC TGCGT CAGCT TCGGC TTCAT CTATG CCCCA CCCGT GGACA AGCTG AAGCA 
CGCCC TGGTG ACGCA GTCGA AGCCG AAGTA GATAC GGGGT GGGCA CCTGT TCGAC TTCGT 

1270 1280 1290 1300 1310 1320 

GGATC TGAAG CGGTC GGGTG TCCTG CGCCC CCAGG ACGAG CGTCC GTCGT GATCT CTGAT 
CCTAG ACTTC GCCAG CCCAC AGGAC GCGGG GGTCC TGCTC GCAGG CAGCA CTAGA GACTA 

1330 1340 1350 1360 1370 1380 

CCCGG CGTGG CCGCA TGAAA TCCGA CGTGC TGCTG GCAGG GGCCG GCCTT GCCAA CGGAC 
GGGCC GCACC GGCGT ACTTT AGGCT GCACG ACGAC CGTCC CCGGC CGGAA CGGTT GCCTG 

1390 1400 1410 1420 1430 1440 

TGATC GCGCT GGCGA TCCGC AAGGC GCGGC CCGAC CTTCG CGTGC TGCTG CTGGA CCGTG 
ACTAG CGCGA CCGCT AGGCG TTCCG CGCCG GGCTG GAAGC GCACG ACGAC GACCT GGCAC 



FIG. 17 






1450 1460 1470 1480 1490 1500 

CGGCG GGCGC CTCGG ACGGG CATAC TTGGT CCTGC CACGA CACCG ATTTG GCGCC GCACT 

GCCGC CCGCG GAGCC TGCCC GTATG AACCA GGACG GTGCT GTGGC TAAAC CGCGG CGTGA 

1510 1520 1530 1540 1550 1560 

GGCTG GACCG CCTGA AGCCG ATCAG GCGTG GCGAC TGGCC CGATC AGGAG GTGCG GTTCC 

CCGAC CTGGC GGACT TCGGC TAGTC CGCAC CGCTG ACCGG GCTAG TCCTC CACGC CAAGG 

1570 1580 1590 1600 1610 1620 

CAGAC CATTC GCGAA GGCTC CGGGC CGGAT ATGGC TCGAT CGACG GGCGG GGGCT GATGC 

GTCTG GTAAG CGCTT CCGAG GCCCG GCCTA TACCG AGCTA GCTGC CCGCC CCCGA CTACG 

1631 

GTGCG GTGAC C 
CACGC CACTG G 
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